Sony released some software a couple of months ago that lets you use most of their DSLRs as webcams over USB. My goodness, paired with a fast lens, what a difference compared to my MacBook webcam, even with these ML-blurred backgrounds!
It's only 720p and around 15 fps, but you get really shallow DoF, very little sensor noise, and working autofocus. Well worth trying if you have a Sony camera from the last few years.
Sensor size and good optics still win. Having said that, the effort and detail that went into this feature are very impressive, and I enjoyed the blog post. Also, WebAssembly SIMD looks super cool; looking forward to a new class of web apps using wasm.
I recently tried to get a setup similar to this with a Fujifilm X-T20 I had lying around, remembering that Fujifilm announced similar software. Alas, that software only works with their higher end models.
I ended up getting a $10 HDMI USB capture stick from Aliexpress. I get a perfect 1080p/60fps signal, and at least on Linux it worked out of the box with Zoom.
The only problem now is that most of my meetings start with "wow, why do you look like you're on TV?"
Canon did too! Definitely a huge upgrade over a typical webcam.
I'm using my old T1i, which can be had for less than $50 these days, plus you can pick up an 18-55mm kit lens for like $20, and the video quality blows away any webcam, especially at the same price. Also recommend a battery-to-AC power adapter.
Canon and Nikon do too. In practice, the quality bump is nice, but we are still talking about a fairly low resolution/bitrate once it gets through Zoom, so the end result is fairly underwhelming, at least as far as what the other people see on their end.
Yeah... both Zoom and Google Meet support >720p video, but the bitrate, especially on Zoom, is a travesty: a 600 kbps/1.2 Mbps stream with all the different resolutions packed into the same stream.
The codec situation with H.264/HEVC/VP9/AV1 software/hardware encoding is a mess. Hopefully we'll get wide hardware support for AV1, although it might take a while.
Woot. Thanks for pointing this out - I looked for a solution a while back and it seemed like I had to get a separate capture card to connect my Sony DSLR. Will go check this out now.
(I ended up having to buy a little logitech webcam, which has been fine, but being able to pick my lens etc is awesome!)
I use my Android phone's (Redmi Note 8 Pro) primary camera (720p, I think) with DroidCam and it works like a charm on Linux.
I also tried gPhoto2/ffmpeg with a virtual cam driver and a Nikon D5200 (USB) on Linux, but I prefer the Redmi since I don't have a decent low-light lens for my DSLR.
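The usual pipeline, roughly, is to pipe the gPhoto2 live view into ffmpeg and out to a v4l2loopback device. Here's a minimal Python sketch of that pipeline (assuming v4l2loopback is already loaded and /dev/video2 is the loopback device, both of which will differ per machine):

    # Pipe a gPhoto2-supported camera into a v4l2loopback device so it shows
    # up as a regular webcam. /dev/video2 is an assumption -- check yours.
    import subprocess

    camera = subprocess.Popen(
        ["gphoto2", "--stdout", "--capture-movie"],      # live MJPEG stream over USB
        stdout=subprocess.PIPE,
    )
    encoder = subprocess.Popen(
        ["ffmpeg", "-i", "-",                            # read the camera stream from stdin
         "-vcodec", "rawvideo", "-pix_fmt", "yuv420p",   # decode to raw frames
         "-f", "v4l2", "/dev/video2"],                   # write into the virtual webcam
        stdin=camera.stdout,
    )
    encoder.wait()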
Having used both Zoom and Meet extensively now for the past 6 months, my experience is:
1/ Your internet connection, especially upload bandwidth and latency matter a lot.
2/ Zoom's desktop app performs very well, but its web version is atrocious. Not just because of the dark patterns they use to force you to install the desktop app, but also its performance is terrible compared to its desktop version, as well as worse than almost everything else. Unfortunately, I don't trust them and refuse to use their desktop app on anything but my iPad.
3/ Meet used to be as bad as Zoom on the web 6 months ago, but has improved a lot and is slowly approaching Zoom desktop in performance. I have noticed that Meet calls on my work GSuite account perform much better than on my personal account. This might be explained by #1 above, i.e. my family has worse internet connections than my coworkers, but I am not sure if all improvements have been rolled out to personal accounts.
> 1/ Your internet connection, especially upload bandwidth and latency matter a lot.
I moved to a new house, and the quality of my video calls dropped dramatically. Constant freezing and dropouts. It was extremely frustrating to try to participate in a meeting. I could receive fine, but anytime I spoke out, I would drop out within minutes.
Speed tests showed plenty of bandwidth, but my modem statistics showed high upstream power levels, occasionally out of the allowed range, and lots of "uncorrectable" packets.
I finally got a Comcast technician in to look at it (yay for business-class support), and they replaced the cable from the pole all the way to the first splitter in the basement, and since then it's been flawless. 100/15 Megabit service has been totally adequate for our needs, so long as it's reliable and the latency is low enough.
It kills me that our city isn't putting in conduits or fiber while doing utility work, though. The whole time that was happening, there were gas contractors opening the street and running new supply lines to every house, but not putting in any extra conduits or dark fiber. The construction sounds were almost like being back in the office...
Today I took a flawless WebEx meeting on a laptop tethered to my mobile phone, and that same tether also let me work without issue over RDP and the like.
My mobile internet is really fucking good, and often outperforms my sodding wired connection
I've had great experience with T-Mobile 4G. It outperforms my wired Frontier connection in both upload and download speeds, although it has been getting spotty lately; during peak hours the speed drops significantly.
>1/ Your internet connection, especially upload bandwidth and latency matter a lot.
It grates on me when people claim DSL/cable qualifies as sufficiently good broadband in the US, given the lack of upload bandwidth and the high latency (add packet loss in here too). The situation is so bad that you often can't even find out how much upload bandwidth so-called "broadband" cable ISPs offer.
The experience on symmetric fiber connections is noticeably better: we can have a house with a whole group of people streaming video up and down simultaneously without a hiccup, which matters in times of work-from-home and school-from-home.
Disclosure: I work on Google Cloud (but not Meet).
For the last item, personal accounts (only?) default to send and receive video at lower resolution (360p). So if you meant that the quality is lower, you can set it on both sides to 720p.
Edit: I don’t think Meet remembers those settings though, so you have to do it every time (and show your family members how to do so).
As a legacy free GApps user it is even more confusing because the admin page gives me an option to default to higher quality video but that doesn't do anything.
Why does Google, with all the resources at its disposal, choose to cheap out like this when competitors in the video chat space (from tiny startups to gigantic corporations of similar size) have offered near-native-resolution video chat for ages?
Meet certainly rolls out improvements for GSuite before public ones. I think there's even a GSuite "release channel" setting where you can control how early you get these improvements.
I refuse to install Zoom. They have removed the dark pattern, and the "join via browser" option is almost immediately available. If you have it installed, now is a good time to uninstall it.
The example video clips in the post look nothing like me and my team's view when using the new feature. Most of the time half of our hair gets blurred or replaced and hand gestures will cause either our hands or head to disappear.
I can vouch for this. I haven't really needed the background blur feature personally, but I've tried it, and my colleagues, friends, and I — pretty much everyone I've talked to who has used it — loathe Google Meet's background blur and prefer Zoom's by far.
In my experience, it doesn’t completely cover the background most of the time, and if you move at all, as you point out, it can’t keep up.
Kind of funny to see Google engineering blogging about it when it feels extremely half baked.
This makes me sad, because in all other areas, I think Meet excels well beyond the competition.
"Half baked" misrepresents the difficulty this task. Yes, Zoom does it better, but it's _still_ an excellent and interesting engineering accomplishment.
I've always wondered what proportion of modern real-time video effects rely on ML vs. classical image processing; this not only answers that question, but provides details down to the level of model architecture and the final latency and IOU benchmarks.
Of course I'd be more interested to read how Zoom manages to do even better, but I'm not holding my breath for them to publish those details.
At least for background blur the latency there is enough to make it almost unusable: easily over 100ms. This is with the latest stable Chrome on a relatively recent Ryzen/Nvidia system. Maybe background replacement will do better once it rolls down to regular Google Meet (too lazy to log into my Google Apps, er, G Suite, er, Workspace account) :-) However, everything else about Google Meet is great and I wish I could make all my Zoom friends switch.
I have a pretty modest machine and zoom wins hands-down. It also "just works." I've had trouble getting non-technical people on hangouts/meet/whatever they call it today. Zoom "just works," and they've been responsive to peoples' concerns.
You can use Zoom in the browser. They "just" discourage it by using a dark pattern. The link for the web client is small and gray and the browser tries to open the desktop app automatically.
You can also join by phone, at least in some circumstances.
Yeah, which makes it a pretty annoying barrier if you want to make an ad-hoc call to someone. Sending a Meet link is more convenient. Plus, Zoom is pretty crippled on the web in the feature department.
Meet seems to work better on poor connections, but it does it with a significantly more CPU intensive codec (VP9?). As a result, it only seems to work well if you have a powerful CPU. If you have a weak CPU, Zoom seems to work much better.
My only experience with Meet on a weak CPU is my daughter using it for remote learning on her school-supplied Chromebook, which has a MediaTek processor from 2015 with 2 A53 and 2 A72 cores. Meet performs fine on that platform.
It seems to have gotten a little better recently, but my experience matches yours. It really struggles when I wear over-ear headphones - they sort of phase in and out of existence.
The other thing I've noticed is the background blur absolutely annihilates my CPU. To the point where I would rather just turn off my camera if I don't want my background visible.
They have their example video clips, but they also provide data: they say their better model gets an IoU of 93.8%, meaning roughly 6.2% of pixels are misclassified. Either it's your hair getting cut off or the background leaking through. 6.2% of an image is a fair bit considering your head is probably 30% of the frame.
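For reference, IoU (intersection over union) for a binary person mask is just the overlap between the predicted and true masks divided by their combined area. A tiny NumPy sketch (the example masks are made up):

    import numpy as np

    def iou(pred, truth):
        # Both arguments are boolean HxW masks: True = person, False = background.
        pred, truth = pred.astype(bool), truth.astype(bool)
        union = np.logical_or(pred, truth).sum()
        return np.logical_and(pred, truth).sum() / union if union else 1.0

    # Made-up 100x100 frame where the prediction misses a 10x20 strip of hair.
    truth = np.zeros((100, 100)); truth[20:80, 30:70] = 1
    pred = truth.copy(); pred[20:30, 30:50] = 0
    print(iou(pred, truth))   # ~0.92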
I'm wondering why they didn't just use standard CV techniques like background subtraction? Does their technique work with a dynamic background as well?
I’ve done some work in this space - subtraction doesn't perform well when other motion is present, whereas if you use pose/body detection you can ignore other bodies in view (e.g., the toddler running across the room).
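For context, the classical baseline being discussed is something like OpenCV's MOG2 subtractor, which models each pixel's background statistics over time. A minimal sketch (webcam index 0 is an assumption):

    import cv2

    cap = cv2.VideoCapture(0)   # default webcam; index is an assumption
    subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=False)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # 255 where the pixel differs from the learned background, 0 elsewhere.
        # Any motion (a toddler, a curtain, camera shake) ends up in the mask.
        mask = subtractor.apply(frame)
        cv2.imshow("foreground mask", mask)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

    cap.release()
    cv2.destroyAllWindows()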
Aside: Imagine you’re driving down the road and you need to make a right turn. Well, for some reason the steering wheel is stowed away and disappeared! You need to hover your hand around the center console in a specific area to be able to expose it. Out comes the steering wheel and now you can make a right turn.
Google UX/UI team: Please fucking make the mute/unmute button visible at all times.
Isn't this sort of a Fizz Buzz for a UX/UI design professional? I don't mean to demean anyone, but I see this sort of a thing literally everywhere. Hiding important and absolutely crucial information (that can make or break your product) in the name of minimalism. Coming out of a company that has one of the highest hiring bars for software engineering, and yet, their products have such an awful UX/UI. This isn't an exception, it is a pattern.
I worked as a freelance graphic artist/web designer once and while I wasn't bad at the job, I really hated one aspect of it:
Everybody and their kid thought they knew better than I did. When I said: "Yeah but this should really be visible, because accessibility", they would say: " But it looks better if..."
People in high paid position certainly want "has taste" and "knows what looks good" to be part of their self image. Many fails in design and architecture happen for that reason alone.
I then ended up programming and working in film sound, because very few people in both fields tell you what to do when they have no idea what's going on.
Ah ha, someone with the same experience as me. My degree is in Graphic Design, but I immediately ditched the idea of using it after university and took up programming instead because everyone has a fucking opinion when it comes to design.
Imagine a pointy-haired boss, or some rando in Marketing doing your code review (shudders) "That value is a trademarked name of our product - I mean variable - please capitalize it and add a (TM)" I'm glad I don't get noob oversight the way designers do.
I actually studied film, and through my music experience I was always "the sound guy"; programming was more of a hobby until it turned out I'm actually not bad at it.
I did a fair amount of indie films and know sound guys, so the part I am confused about is: what are you programming in film sound? is it per-film, or like software for film sound in general?
Ironically forgetting that visual minimalism produced by hiding things isn’t really minimalism.
It would be like me throwing all my things in the garage and advertising my house as Spartan. No, it’s not, it’s a mess. The mess is just hidden until I need to do something.
"Hiding important and absolutely crucial information"
If we want to give awards for this my vote would go to Apple. I find their products to be horrific when it comes to completely undiscoverable features. iOS is bad on its own but the Apple TV is a total train wreck. I couldn't get rid of that thing with its awful interface and remote fast enough.
Exactly. Everybody does this. In anything using video, UI elements apparently need to be hidden as much as possible. In virtual meetings, Youtube, and it's often an option in games.
And sometimes it's great, because you get to focus on the content, and sometimes it's not, because you lose control. It's something that should be optional or configurable. It's great to have shortcuts for the most common commands (like space for pause in youtube), and I guess it would make a lot of sense if video conferencing tools also had such a shortcut for mute/unmute.
But again, give people more control over their UI. There are too many applications that mess this up one way or another.
But this has been a solved problem for ages... just move your mouse a tiny bit, and all the controls are exposed, with large, visible buttons, help text, etc.; click whatever you need to, and the controls slowly disappear, revealing the video.
Having to find the exact spot to hover your mouse is a bad UX
Which also happens to be the shortcut for bookmarking webpages in most browsers... and Meet doesn't let you rebind this to something sane like spacebar.
This is true. I find Android UI so offensive that if I did not have iOS as an alternate I probably would carry a dumb phone and live like a monk. I can’t stand the miles of white space and brightly coloured tiny UI controls.
Evokes such a visceral reaction in me that even I am startled at times haha
As a developer I'm a huge fan of Google Cloud. But I'd actually think really hard about choosing them if I started my own business, as the customer service is both expensive and woeful.
More important than the button is the status indicator - I need to know if the call is muted or not. Even better, promote it to an OS-level icon/badge/overlay. If my mic is actively in use, please make it blindingly obvious.
And then I have to keep hovering over the icon and guessing from the tooltip whether I'm muted or not. Unfortunately, different software tends to be inconsistent with toggle buttons - sometimes the icon tells you what is, sometimes it tells you what will happen if you click it.
The only software that gets video conferencing right is probably Discord.
I've used MS Teams and Zoom and both are decent (MS Teams works fine for school),
but it's insanely unbelievable that this kind of software lacks features that gaming communities had probably 20 years ago.
PUSH TO TALK is probably one of the most important features of any voice software. The lack of it is a big WTF.
It gives you 100% control over when you're talking and you don't have to alt-tab between programs in order to "mute" yourself.
You can bind it to e.g. MOUSE3 (scroll-wheel click) and it works fine with other programs, games and such. Switching between muted/unmuted is a different thing.
From somebody who has used Ventrilo, Mumble, TeamSpeak and nowadays Discord for about the last 12 years, for hours per day, almost every day.
For push to talk to work, you need to have access to keys even when you're not in focus.
That's not something doable today on the web for obvious security reasons, but it's possible for Discord, which has a separate app, and would be doable for Zoom too, I guess.
Interesting sidenote: PTT works fairly well on mobile. I'm in a lot of meetings where folks are using their computers for video and "dial in" for audio on mobile so that they can continue working and then PTT on the phone which is now functionally a giant dedicated button for speaking.
It’s even worse on touch devices. You have to touch the bottom of the screen to get the controls to appear. Accidentally touch twice in the wrong location and you can hang up.
I've often thought that on a touch screen device the OS should ignore touches on buttons/popups that have been on screen for less time than a human could reasonably have observed it and chosen to interact with it. If I touch the screen 0.05 seconds after a button appears, I was probably _not_ aiming for that button.
In fact, now I think about it, this has happened many times over the years with traditional mouse-driven interfaces too.
I'm sure some power users would like to shorten the 'reaction time delay' or even remove it entirely so I guess that should be an option as well.
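Something like this would be enough to prototype the idea; a minimal sketch where the 300 ms threshold is a made-up number:

    import time

    REACTION_DELAY = 0.3   # seconds; made-up threshold, ideally user-configurable

    class GuardedButton:
        """Ignores presses that arrive before a human could have seen the control."""
        def __init__(self, label, on_press):
            self.label = label
            self.on_press = on_press
            self.shown_at = None

        def show(self):
            self.shown_at = time.monotonic()

        def press(self):
            if self.shown_at is None or time.monotonic() - self.shown_at < REACTION_DELAY:
                return   # swallow the accidental tap
            self.on_press()

    hang_up = GuardedButton("Hang up", lambda: print("call ended"))
    hang_up.show()
    hang_up.press()    # too soon after appearing: ignored
    time.sleep(0.4)
    hang_up.press()    # prints "call ended"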
Honestly with mouse driven interfaces the rule should be that whatever is popping up on screen absolutely cannot put an interactable control under the current mouse cursor location, and no control should have control focus by default.
There's nothing quite like watching a dialog box go flying by because you hit enter at the exact moment it popped up. What was it? What did it do? We'll never know!
My "favorite" instance of that, many years ago, was when the dialog turned out to have said, "Reboot the computer immediately because IT has installed new software." I filed a ticket on that one.
The mute/unmute button changes position and can be hidden in a top bar that slides out.
In some fullscreen situations there is no button to get out of fullscreen. Sometimes double-click works, sometimes it doesn't. Recently I could not even alt-tab away, basically my computer got 'locked' by zoom.
I imagine most know this by now but the space bar works as a push to talk button in Zoom (as long as it has focus of course).
I really think there is a market for a physical video conference controller. If I could get a hefty slab of something with quality buttons to enable/disable video, push to talk/mute/unmute, bring to foreground, ‘on air’ light and end call, I’d easily pay $100 for it.
These exist, e.g. the Elgato Stream Deck. It's basically a keypad with x buttons (there are various versions) that each are small lcd displays that you can program to show and do anything you want (so you can make it do the 'on air' display thing you mention). Its main use case is for streamers to switch between scenes in their streaming software, but I use it for video conferencing (with OBS's virtual camera) to switch between full-screen camera view and desktop sharing, and do stuff like mute/unmute etc.
Is it possible to use it to control a Zoom session without virtualizing the audio/video input devices? Discord has a local API for that but I haven't found a way to control Zoom calls from another app.
Not sure what you mean by 'control a zoom session', but yes, I use it with Zoom. I use OBS to composite video and some audio, and I use the OBS virtual camera as the camera device in Zoom. For audio I usually use the straight microphone stream, because mixing is fiddly to set up (you have to do it outside OBS, since OBS doesn't have a virtual audio device).
If you mean that you just want to mute/unmute a zoom session, then also yes - you configure the stream deck to output key press events so you'd program it to output the keyboard shortcuts that you want. Not sure if Zoom has separate mute/unmute shortcuts and if you change settings with the regular keyboard/mouse you might get the display state of the stream deck out of sync with the actual state of the software, that would probably be finicky and/or a lot of work to solve.
I'm still tweaking my setup but using this piece of kit with a good quality webcam, a Blue Yeti mic on an arm, and OBS, being able to control Zoom/MS Teams/Skype in a uniform way, having ultimate control over what part the desktop I share, how I pre-process audio, being able to show my desktop with myself in the corner, ... is already so much better than the clunky default experiences of each of these video conferencing tools. It's like programming with vim - yes I spend an inordinate amount of time 20+ years ago getting proficient with it, but using it just feels like an extension of my brain, like using a Hilti drill hammer vs using a bargain bin Chinese piece of junk.
Thanks for the explanation! Sorry, I got distracted and forgot to write a reply. I don't need the full range of features offered by OBS yet, but I'm strongly considering setting it up just to have control over the video stream. I'm using the Zoom (hah) portable recorder for audio since it offers outstanding audio quality, convenient mic controls and basic signal processing. The problem with controlling apps via keystrokes is exactly what you describe: since the communication is one way, the state of the toggle buttons inevitably gets out of sync. I think maybe using the accessibility API to read the UI state back can help, but I'm not holding my breath.
Yes! I worked at Crittenden Lane about five years ago and really liked the hardware at the time. The whole thing was eye-opening for me, how seamlessly I could meet with folks whether they were in Zurich or on the second floor...I imagine it has only got better since then.
Zoom does this well on the iOS app. They call it "safe driving mode" [1] and half your screen essentially becomes the mute/unmute button. You can either tap it or swipe left to unmute.
And stop telling me I'm using an input different than the output. I have a condenser microphone on an audio interface with RTX Voice; no, it's not going to transmit an echo.
To be fair a lot of sites do need it, especially for more power user level UX. See BetterTTV, RES, etc. Sites generally don't target power users, understandably.
The British PM just had to tell a major media journalist to unmute during the press conference introducing the new quasi-lockdown, so I think we can safely say that Mute button and status is no longer a power-user feature ;)
Doesn't Meet, like most other apps, show a message when you try to speak while muted? Though they should maybe make it more obvious. I do agree that the mute button isn't a power-user feature.
What would be the logic behind “deaf”? That “mute” is a homonym/polyseme of a word for a disability, so let’s just use the first letter of any disability?
Zoom at least uses Cmd+Shift+A for audio and Cmd+Shift+V for video.
But as the recent Google icon kerfuffle shows, UI/UX is not their strength (probably because of opinionated technical people who think you need to A/B test shades of blue).
Teams uses something equally silly, like Ctrl+Shift+M for mute/unmute, IIRC.
Which is pretty annoying, because the mute button is about the most important button in a videoconferencing tool, and I want to have it under a single keypress, so it can be used effortlessly, with my left hand (the same that operates Alt+Tab, while my right hand is on the mouse, scrolling ... well, meeting agenda, let's say).
I'd fix that for myself with AutoHotkey, but I can't, because Teams is just another Electron app, so I can't just look at which UI component has the focus to create a rule, "if focused on Teams video call and not its chat, rebind M to Ctrl+Shift+M".
One of the countless reasons I hate it when people do custom UI, instead of using OS-provided controls.
Speaking of mute/unmute I've not yet found a way to get Google Hangouts (same thing as Meet?) to play nice in situations where simultaneous interpretation is involved. Our company works in Japanese and English and we typically have a second meeting running in parallel for interpretation. This setup almost works, I say almost because I've yet to find a way of muting the audio in one meeting so I can properly listen to the other. I can't leave the first meeting either because often I'll also want to see the presentation slides. Currently I'm working around this by muting my MacBook and joining the second meeting on my phone.
Perhaps I'm missing something obvious (or a Chrome plugin that will allow me to mute based on the page URL rather than site). In the unlikely event that a Googler is reading this I'm not asking for yet another product or complicated new piece of functionality aimed at this specific use case. Just a mute button for audio. Thanks!
Wow that's strange. FWIW Firefox does not do the same domain-level blocking, only tab-level blocking. And as far as I know, Hangouts still works in Firefox.
A major motivation for getting a StreamDeck was to be able to put a big fat mute button on it that "physically" kills the microphone level at the source.
It renders a big cross through the microphone when muted.
Simple, yet insanely effective UI (#).
Best thing ever.
#) Especially when compared to the mess that is Google Meet. My favourite "feature" of theirs is how when someone is presenting, it's impossible to view the presentation as just another stream - no they have to make it dominate everything, meaning it's so hard to see the other team members.
And it can be extremely hard to see who's talking when viewing a lot of cameras at the same time. And for whatever reason the quality turns to a blurry mess a far cry from 720p just way too often. (I have fibre internet).
When did you last use Meet? I just used it yesterday for a gaming session with friends and the mute/unmute control was visible at all times. I even just tried it right now.
While you're at it, always display a VU meter. It gives feedback on what is being transmitted and thus can alert a user whether they are being heard or not. It's the most basic of sound recording tools, and was a standard part of recording equipment for over half a century for good reason.
And if you need minimalism, offer a toggle for that. But I think most people should have it forced on them; it would save everyone a lot of trouble -- just think about all the aggregate time lost talking into a muted mic across all users.
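Even a few lines get you a basic one; a sketch using the sounddevice package (the package choice, sample rate, and scaling factor are all assumptions):

    import numpy as np
    import sounddevice as sd

    def show_level(indata, frames, time_info, status):
        # RMS of the current block of samples, scaled into a crude text bar.
        rms = float(np.sqrt(np.mean(indata ** 2)))
        bars = int(min(rms * 300, 50))
        print("\r[" + "#" * bars + " " * (50 - bars) + "]", end="", flush=True)

    with sd.InputStream(channels=1, samplerate=16000, callback=show_level):
        input("Speak into the mic; press Enter to stop.\n")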
We are in the era of three seashells. There is no turning back from this. Soon you won't be able to find the power button for anything tech industry related.
Which is so odd, as Ctrl-D is also the bookmark shortcut in Google Chrome. So, say for example, my team has a go-to channel where we have our ad-hoc meetings. It's a pain to bookmark it for later use without jumping through the GUI.
MS Teams has finally changed this on their video calls. Ah, the hours I spent telling colleagues 'If you move your mouse around, you should see a black bar appear somewhere near the middle of the screen'.
Happy to see ML become mainstream. In the future, I don't think ML will be a separate field of programming. It'll just be "programming," the same way webdev is.
There's a tendency to think of ML as "not programming," or something other than just plain programming. But as the tooling matures, that'll go away.
(Lisp used to be considered "AI programming," till it became useful in many other contexts.)
ML will become a library. It has about as much to do with programming as a compiler. You don't need to know what it does, you just need to know how to make it do things. The problem with ML currently is that nobody really knows how to do things and that you have a million parameters that need tuning and most algorithms need continuous improvement and fine tuning to the use case. There is nothing "mainstream" about ML at this point, except that everyone wants to use it.
In maybe a decade, it might be found in the standard libraries of programming languages, and on top of things like `Math.abs` we will have `ML.textToSpeech("Hello world")`, or `ML.isCat(image)`, etc. However, the problem I see with that is that no matter how far we wind the clock forward, we will only be able to put the most simplistic use cases into a library. `ML.isCat()` could be one of those: since most humans can do that kind of image categorization, it stands to reason that you could put it into a library. However, most industry applications involve highly customized ML algorithms that are optimized for a very specific use case. So there will always be a need for a research team, in big companies at least. Maybe smaller companies will try to build their stuff by chaining libraries together.
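To be fair, a crude `isCat` is already roughly a dozen lines around an off-the-shelf ImageNet classifier; the packaging below is purely illustrative (MobileNetV2 via Keras, with ImageNet classes 281-285 being the domestic cats, and the file path is hypothetical):

    import numpy as np
    from tensorflow.keras.applications.mobilenet_v2 import MobileNetV2, preprocess_input
    from tensorflow.keras.preprocessing import image as keras_image

    _model = MobileNetV2(weights="imagenet")
    _CAT_CLASSES = set(range(281, 286))   # tabby .. Egyptian cat in ImageNet

    def is_cat(path):
        img = keras_image.load_img(path, target_size=(224, 224))
        x = preprocess_input(np.expand_dims(keras_image.img_to_array(img), axis=0))
        return int(np.argmax(_model.predict(x)[0])) in _CAT_CLASSES

    print(is_cat("some_photo.jpg"))   # hypothetical file path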
There's never going to be an `ML.isCat(image)`, just like there isn't a `Math.solveProblem(hypothesis)`. Yes, you do have `Math.abs`, and you're going to have stuff like `model.fit()` and `layers.dense()` - but something like `ML.isCat` is too specific to be used in a library.
Disagree. In the future, that'll be `npm install ml-cat` followed by `MLCat = require('ml-cat'); MLCat.isCat(image)`
It might not be npm, but something like that is probably inevitable.
The reason it seems so unlikely is because the tooling isn't there yet. No one even agrees how ML code should look, let alone how libs should be distributed to end users. But I saw the transformation for JS in 2008.
the range of problems you can solve with ML/AI is simply too wide for there to be fully-canned solutions for everything. Sure, there will be canned solutions for _some_ things - maybe even for cat detection, because it's fun so why not.
But, a library that uses AI to optimize the production of your business' flux capacitors? Ain't gonna happen, you need to build that yourself. To have a library/product that solves problems using AI, you need a "language" to describe the problem (like you can e.g. use SQL to describe any data query you may have). But describing problems is notoriously hard - accurately & precisely describing the problem is very often just as hard as solving it.
Mm, it's a bit like arguing that "the range of text editor customization is simply too wide for there to be fully-canned solutions for everything." Meanwhile, elisp wiki go brr.
I think ML solutions will increasingly take the form of an elisp script rather than a python library, but it'll take a little while to get there.
> it's a bit like arguing that "the range of text editor customization is simply too wide for there to be fully-canned solutions for everything."
But the range of editor customization really isn't that wide. That's exactly what I'm arguing, that ML/AI is more like "math" than like "editor customization".
FWIW, Macs have had equivalent functionality for both text-to-speech and speech-to-text for at least 17 years, to my memory. The quality is poor compared to today's server-driven approaches, of course, but the functionality has been there if you're willing to articulate yourself clearly.
AI is learning existing patterns from input/outputs.
Programming is setting up patterns to turn your inputs into desired outputs. Most often it's just plumbing data around with some transformations.
What you're talking about is using AI as programming tools. It's still programming, but using pre-trained models as part of the plumbing.
We used to use Jitsi Meet and it worked perfectly for our team meetings, but we kept having issues with 10+ participants, overseas meetings with 100ms+ latency, and whenever Firefox was used. YMMV, one year ago.
My team does weekly Google Meet meetings along with WebEx. The biggest complaint I'd have is that Meet sacrifices functionality for cleanliness; everything useful is hidden behind some menu or popover, and you can only open one popover at a time (otherwise whatever you had open closes). This contrasts widely with WebEx, where most things (participants, controls, chat) can be shown at the same time, but also hidden if you don't want to see them. Meet seems complicated in comparison because views that are 0 clicks away on WebEx require 1-2 in Meet.
Basically what other comments suggested. The popup menu shows up every time I move the cursor and covers part of the screen. It can't be hidden on demand. It shows status I can't see without opening it (and covering part of the screen). I can't change my mute status without opening it (and covering part of the screen) or using a very obscure shortcut.
I am confused. The microphone button is on the bottom bar, is clearly available at all times, and can always be clicked-on.
You are using Google Meet within a browser?
I am going to admit that Nvidia Broadcast looks absolutely amazing to me. It's likely to be the reason why my next GPU won't be AMD's new one, even though it appears to deliver much more bang for the buck.
I already have RTX Voice now and it's the best thing ever.
No, because tech people want software that works, has good UX, etc. This is a PR piece for people that prefer software with cutesy little backgrounds.
I thought the whole point of having a video call is to see who you are talking to, and their environment to further enhance the effectiveness of the conversation.
If you are in your kitchen, or under a tree, I definitely would like to see that because that environment will have an effect on how we communicate.
Sometimes people may not be comfortable sharing their backgrounds, and may not have convenient alternatives. For example, if you have a bed in the background it can be awkward and you might want to blur that out.
I don't bother, but then I live in my own home and my background is an empty study.
I have coworkers who are in house shares with 5 other adults all trying to work from home around tiny desks. Background blur for them is a nice way to hide some of the chaos of their living arrangements.
If the apartment is a mess in general. Table full of empty cans of beer. A dildo on a chair. Your wife randomly walking by in her underwear (not sure whether this would be unblurred?).
In the above scenarios, if I'm not certain there aren't going to be awkward things behind me, I'd want to blur or set a custom background. Sitting with your back against a wall also works, which is what a lot of people seem to be doing.
Why not just turn off your camera? The blurring tech doesn't seem nearly reliable enough for me to trust it if my "office" was that much of a catastrophe.
Yeah, this should be obvious. I think video calls are a waste of time the majority of the time, but one legitimate use case is where there's an issue that doesn't seem to be easily resolved using written media. In this case it's useful to have a video call where you can gauge someone's reaction to specific things you say. That way you might be able to get to the gist of where the miscommunication is happening. A dildo in the background doesn't add to this (although a bunch of empty vodka bottles might give some clues), while seeing a person's reactions to statements/questions might.
> In the current version, model inference is executed on the client’s CPU for low power consumption and widest device coverage.
Naively I would think model inference done server-side would mean lower CPU usage (from the client's point of view) and the widest device coverage (the client does nothing more). What am I missing?
It is done on the CPU instead of the GPU. The GPU would seem like the natural choice for a convolution-heavy model, but it was not used here for the reasons mentioned.
Some work needs to happen locally to show you a preview of what you're going to transmit, as it should for most video related work.
If the segmentation is done server-side, then you need to sync it to the sender and reflect that quickly in the preview. It's probably not a great experience, at least for a launch.
I wish my coworkers would stop using background blur.
It sucks and it’s distracting.
Your hair and hands pop in and out of blur. Sometimes part of your face will blur.
I don’t care if your workspace is messy or your kid walks in the room. I do care that we’re all being distracted by your weirdly blurred hair and hands.
Your co-workers have a reasonable expectation of privacy regarding their home life and family members.
Given that many had to start WFH on short notice, meaning they couldn't relocate to circumstances that allow a dedicated home office space, blurry hair and hands are a very reasonable compromise.
> Your co-workers have a reasonable expectation of privacy regarding their home life and family members.
I think you are overthinking it. I've seen people use it when it provides no real material benefit other than the placebo effect on the user to believe that the blur makes other people focus on their face.
Yeah, this is why I use the background blur. I have my wife's and my hobby stuff behind me. Can't really align the video/PC setup any other way.
I'd rather provide a blur than a jumble of guitars, sewing kits and such.
Is it really that hard to set up a greenscreen for this? I can look out across the street and see a number of people who have done this in their tiny apartments. If I cared about people being able to see the room behind my WFH setup I would do it too. Thankfully for me it just points at the wall I use as a projector screen, so there's nothing to see. Plus my team seems to have just given up on video anyway.
I find background blur even more distracting than background replacement. It's like my mind tries to picture the person that I am seeing in a particular environment and blur makes that process messy.
But that's not always true, though; I have seen background replacement bleed all over people's faces (and yes, I seem to be the only one who thinks that's wrong).
I don't think anyone is being distracted by blurred hair or hands. If your coworkers don't feel comfortable even turning on the camera, it shouldn't matter to you. Aside from edge cases like a modelling agency looking for fresh faces, you have zero right to demand how people choose to portray themselves in a VC call.
I have no problem with people choosing to leave their cameras off (I rarely turn my own camera on in meetings). I still think the complaint about poorly implemented background blur/background replacement is at least partially valid. It is very distracting to me compared to either a raw camera or no camera at all.
> Can we get a mute button visible at all times before 2024?
Is it just me, or is the button visible at all times? I could see the button at the bottom of the screen the whole time I used Meet during a session with friends. I even tried it just now to make sure.
They mention SIMD support, but it's unclear to me in what capacity the GPU is leveraged. The hair segmentation example on the MediaPipe webpage suggests it's evaluating the graph on the GPU, though.
The "Rendering Effects" section describes it in some detail: "Once segmentation is complete, we use OpenGL shaders for video processing and effect rendering" and some info on what that covers. (OpenGL parts runs on GPU)
It would be nice if there was a webcam on the market that took actual lenses so you could get free, legit depth of field. Paying $700 for a used DSLR that has a clean hdmi out is not appealing, especially when I have a mirrorless from the same company that could probably do the same with a firmware update (that will never come)
I think a cheaper solution would probably just be a depth sensing camera. Even a developer targeted Intel RealSense kit is only like $150. Consumer hardware could be much cheaper I imagine.
Once you have depth information integrated with a camera, then it should be pretty trivial to do background removal.
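Right: with an aligned depth map the "segmentation" collapses to a threshold. A toy sketch (the distances and the fake depth frame are made up; a real RealSense pipeline would supply depth aligned to the colour image):

    import numpy as np

    def person_mask_from_depth(depth_m, near=0.3, far=1.2):
        # True where a pixel is within arm's reach of the camera (metres are guesses).
        return (depth_m > near) & (depth_m < far)

    depth_m = np.random.uniform(0.2, 4.0, size=(480, 640))   # stand-in depth frame
    mask = person_mask_from_depth(depth_m)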
Whereas a 35mm f1.8 from Nikon is like $200 and whatever you mount it to is still going to need to do auto focusing and a bunch of other camera-y stuff to make it accessible to non photo geeks and then you’re going to need an off camera microphone so the entire call isn’t listening to your autofocus motor and...
Meet is business oriented and offers features that Hangouts does not, e.g. dialing in via phone. It also requires a G Suite account (or did before COVID, IIRC).
Here's a tip: take a picture of your real, actual background from the POV of your webcam, and set that as your meeting background.
Advantages: it looks natural, it covers whatever is going on behind you (in case you are not alone and people walk by, or if your living room is messy), and it blends better than fake backgrounds (because it's the same image behind it). I have a picture of my office that I use both at home and at my real office, and most people can't tell. And since I took the picture with my phone, which has better resolution, my video feed looks better for cheap.
The single biggest missing feature compared to Zoom for my team is background noise cancellation. It's an unfortunate decision to limit it to Enterprise users.
I was going to point out that xnnpack was basically created by a single guy who also created qnnpack, and how amazing it is for the work of a single guy to have so much impact, then I realized he posted it! Congratz dude!
As in, the blurred background looks totally different (light:dark, shapes, etc.) to the unblurred background.
(I get that they’d need to do something funky to show blurred and unblurred backgrounds with the same foreground video, and faking it is likely easier than doing it programmatically, but this is just odd/sloppy.)
If you have a Windows computer with an RTX graphics card, you can use Nvidia Broadcast to get similar perks. It creates a virtual camera that you can select in whatever conference apps/browsers you are using.
There is some work going on in OBS to get AI green-screening working, so I hope we will get that on GNU/Linux one day.
The listed CPU usage / elapsed time for the features in this article is obscene. Only 62 FPS means maxing out at least one core on a 60 Hz display, just to replace/blur a background. Kiss your laptop's battery goodbye. How is this worth it?
Why isn't Mediapipe built on gstreamer? Nvidia gets this right. If you're slinging frame buffers around, use an API that there is already an ecosystem for.
A few people commented that the foreground/background detection cannot keep up with movements fast enough. Here's an idea that might help, although I'm not sure if it can realistically be done:
When the video is encoded, the codec does motion estimation (among other things) to reduce the bandwidth required. So why don't we use the motion vectors from the video codec to modify the foreground/background mask in real time? Obviously this is going to create weird artifacts pretty soon, but it might just be good enough for a few frames before the ML model produces another accurate mask.
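Roughly, you'd warp the last ML-produced mask along the estimated motion until the next full inference lands. Browsers don't expose the codec's motion vectors, so dense optical flow stands in for them in this sketch:

    import cv2
    import numpy as np

    def warp_mask(prev_gray, curr_gray, prev_mask):
        # prev_gray/curr_gray: consecutive grayscale frames; prev_mask: HxW mask.
        # Flow from the current frame back to the previous one, so every current
        # pixel knows where to sample the old mask from (backward warping).
        flow = cv2.calcOpticalFlowFarneback(curr_gray, prev_gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        h, w = prev_mask.shape
        grid_x, grid_y = np.meshgrid(np.arange(w, dtype=np.float32),
                                     np.arange(h, dtype=np.float32))
        map_x = grid_x + flow[..., 0]
        map_y = grid_y + flow[..., 1]
        return cv2.remap(prev_mask.astype(np.float32), map_x, map_y, cv2.INTER_LINEAR)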
I have observed in the last couple months that whenever I create a Google Calendar invite with others, Google has started inserting a Google Meet conference as the location to meet.
It was one thing to ask/offer this as an option if you'd like to use it, but now Google is positioning it as if you had chosen that. So if you left it empty, because you usually use some other understood method with your friends/colleagues, now your participants are confused and think you wanted to use Google Meet.
I think that's going too far to get people to adopt your product.
I noticed this too, but I actually got a tooltip popup that notes this is something that can be disabled in the calendar settings. The specific checkbox is "Automatically add Google Meet video conferences to events I create"
Disclaimer: I work at Google but not on these products.
Edit: it seems the tooltip only appears the first time you try to add Meet. After that it doesn't appear and you have to go into settings.
Automatic door sensors are in my experience universally infra red. In fact I don't think I've ever seen any camera technology used in that context. Are you saying they used cameras to open the doors?
It's video conferencing software. Makes sense that they might put together imagery which might suggest people meeting from different corners of the planet. But sure, I get your point. I didn't notice this myself, but I have been living on side of the planet opposite from where I was born for the past decade.
I understand the feeling somewhat: it is very noticeable how TV shows for example now have expanded the default cast from "all white people + token black, maybe a gay individual" to "all white people + token black woman + token asian + token non-binary, maybe a transgender individual".
It feels forced and ham-fisted, but I don't see how this could be made in a better way.
Possibly people growing up with this will not notice it at all, and it will be good for them, it's only us old farts that need to adapt.
I think the issue is that, were it not for everyone being forced to notice, everyone was defaulting to caring so little that we just used the people around us or who we had existing connections with (for demos, training data, employees... whatever), and everything was horribly biased due to numerous reasons. So, while before you weren't being forced to pay attention, everything was more racist than now, where you are being forced to actually make an effort to be anti-racist.
To put this into a programming metaphor, to me this is like being triggered by someone going out of their way to add a buffer overflow check due to a bunch of people spending the last decade screaming about buffer overflow security issues. Sure, before you didn't have to notice, but your code was probably also horribly insecure; now, everyone gets angry if you don't take at least minimal precautions, and people are even advocating that you use more secure languages from the start (to the point of questioning your architectural decisions if you don't), so you are being forced to pay attention--and sure, it seems a bit annoying and like extra work that a bunch of fanatics are foisting on you, and if someone had taken the time for their code to be secure before you might not have noticed (I mean, you weren't ever against security) but now it is screaming at you "this is because of those annoying security people rubbing our noses in our buffer overflows" so you get angry because this is taking time away from "getting the real work done" on your product--but the reality is that your code used to have glaring security issues that affected people who weren't you, and it sucked; everyone is better off for you paying attention now, and maybe one day we can fix the systemic problem and no one will have to put in such obvious effort to avoid being part of the problem, but we clearly aren't there yet.

Being angry about this just comes off as not giving a shit about security: you didn't notice these checks before, because the code was just as good to you, for the criteria you were bothering to pay attention to, without the checks as it would have been with them, and the whole point is that that wasn't true... it was actually much worse; similarly, being angry about people making active efforts to have diversity in product advertising frankly just makes you come off as not giving a shit about minority representation :/.
It’s funny how Google pours time into things like this, but the last person I know who uses a Google chat product just stopped because it's less reliable than Zoom. Losing 15 minutes with someone trying to get the sound working counts for more than a gimmick many people never notice, not to mention that now even normal people don't want to install yet another app because they expect it to be cancelled soon.
This is only true to the extent that the IT department has complete control and is insulated from user opinion. That tends to result in choices like WebEx or the ever popular shadow IT option. This is especially true now that everyone is doing this and the odds approach certainty that if someone is having problems they’ll suggest switching to a different product they know works.
Given the number of IT people I’ve heard express concerns about UI quality and eventual cancellation even for enterprise purchases, it’s also far from a given that the IT department is just blindly pushing a product.