Hacker News
Show HN: TCP over sound on Android (github.com/quiet)
179 points by brian-armstrong on Oct 21, 2016 | hide | past | favorite | 62 comments


Seems like this would benefit from not using standard TCP (which assumes that a dropped packet is always due to congestion), and maybe use one of the WiFi retransmission protocols (the names are slipping my mind at the moment).

Might increase that bandwidth from 7kbps to something more comfortable.


Maybe - in my experience with this, when the devices are within a foot or so, there really isn't much packet loss, maybe as low as 10^-6 or so. But I haven't really explored moving further apart. I suspect that this suffers from a sharp knee where it starts out reliable and then very quickly fades off.

To be clear, the 7kbps is the raw frame throughput, not the effective rate with TCP. There are two ways I can see to boost this number. One would be to pack more bits into each symbol, e.g. to use a wider QAM mode. I find that in practice the degradation from using speaker/mic makes this somewhat impractical. The other would be to use a more broadband signal (the width of the main audible channel is a few kHz). But this is also kind of undesirable since it has to compete with more interference across the spectrum and can also be a less pleasing sound.
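To make the symbol-packing trade-off concrete, here's a rough back-of-the-envelope sketch. The symbol rate and QAM orders below are illustrative assumptions, not Quiet's actual profile parameters:

```python
import math

def raw_bitrate(symbol_rate_hz, qam_points):
    """Raw bits per second before any framing/FEC overhead:
    each symbol carries log2(M) bits for an M-point constellation."""
    return symbol_rate_hz * math.log2(qam_points)

# A channel a few kHz wide supports a symbol rate on roughly that order.
for m in (4, 16, 64):
    print(m, raw_bitrate(3500, m))
```

Doubling the bits per symbol is "free" bandwidth-wise, but each step up the QAM ladder needs a cleaner channel, which is exactly where the speaker/mic degradation bites.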


Is the packet loss the same when there is background noise?


Presumably the old V.42 protocol used for modems would be quite suitable for this - it does re-transmissions.


Yes, or perhaps xmodem or zmodem.


Or smodem. One could chat during any data transfer.


WiFi handles the same problem by having CSMA/CA (Carrier Sense Multiple Access/Collision Avoidance) and MAC-level retransmissions:

http://www.labs.hpe.com/personal/Jean_Tourrilhes/Linux/Linux...


PPP would be more suitable, right?


Yes, but you will still have to run a network layer protocol on top of PPP.


CSMA/CA?


Some home appliances use this technology. I have a refrigerator that you can hold your phone next to during a support call: the refrigerator emits some tones, which the support person presumably feeds into a computer program designed to decipher them as error or diagnostic codes.

I'm old enough to know all about those old-school modems. I think this stuff is cool for those niche applications.


I did contracting work on an Android app which records diagnostic data from smoke alarms. You press a button on the alarm and it emits a high-pitched series of chirps. The phone records, decodes and displays stuff like battery level, time of last recorded alarm etc. The encoding scheme was rather simple, only 10 or so bits per second, and used Manchester coding. Most problems I ran into were due to audio hardware differences on different Android phones. Some phones have multiple mics and there is no consistent way to specify which one to use (or, more precisely, to know the location of the mic used). One workaround for this was to record in stereo--phones would usually record each mic to its own stereo channel. Then the app can look at both, and use the one with a better SNR.

Some phones do proprietary audio filtering and noise reduction at hardware or driver level--you don't get access to the raw recorded audio data. On the upside, all phones these days are plenty fast to do CPU intensive audio processing.
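For anyone curious what a low-rate Manchester scheme looks like in code, here's a minimal decoding sketch. This is pure illustration; the actual app's decoder isn't public and certainly differs:

```python
def manchester_decode(half_bits):
    """Decode IEEE-802.3-style Manchester from sampled half-bit levels:
    a low->high pair is a 1, a high->low pair is a 0 (the G.E. Thomas
    convention is the opposite). Any other pair means lost sync."""
    bits = []
    for i in range(0, len(half_bits) - 1, 2):
        pair = (half_bits[i], half_bits[i + 1])
        if pair == (0, 1):
            bits.append(1)
        elif pair == (1, 0):
            bits.append(0)
        else:
            raise ValueError("invalid Manchester pair -- lost sync?")
    return bits
```

The guaranteed mid-bit transition is what makes Manchester attractive at 10 bps: the receiver can recover timing from the signal itself, which matters when you don't control the mic's sample clock.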


I have a small spy cam that I put on quadcopters. It has WiFi but no interface, so to configure it you download a phone app and type in your WiFi credentials; the app modulates them into sound and plays it, the camera picks it up and configures itself, and then you can watch the stream live over WiFi.


It is offered in some LG washing machines, to be paired with their iOS app. Unfortunately the last time I tried it, I could not make it work after 10 tries - 'too noisy' 'bring the phone closer to the speaker' etc. I don't live in a noisy neighbourhood either.


That's a neat idea. You could imagine doing this in DTMF. The fridge could even place the call!
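A sketch of what the fridge's side could look like. The frequency pairs below are the standard DTMF assignments; everything else (function names, parameters) is hypothetical:

```python
import math

# Standard DTMF: each key sums one "row" tone and one "column" tone.
DTMF_PAIRS = {"1": (697, 1209), "5": (770, 1336),
              "9": (852, 1477), "0": (941, 1336)}

def dtmf_tone(key, sample_rate=8000, duration_s=0.1):
    """Generate the dual-tone waveform for one keypad digit."""
    lo, hi = DTMF_PAIRS[key]
    n = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * lo * t / sample_rate)
            + math.sin(2 * math.pi * hi * t / sample_rate)
            for t in range(n)]
```

A decoder only has to measure power at the eight known frequencies, which is part of why DTMF survives terrible audio paths like a phone held up to a fridge.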


In most modern networks like GSM, DTMF is sent via a non-sound channel. That was actually done in part because of Woz's famous blue box.


I believe this is how the Amazon Dash buttons receive wifi network details from your phone.


For anyone interested, there is an app called Chirp by Animal systems that attempted to make data transfer over audio a relevant form of communication. It actually works quite well to push links from my computer to my phone on Airplane mode with the help of a custom python script.


Further offtopic: Chirps include reed-solomon error correction. I'm seeking assistance with reverse engineering it https://math.stackexchange.com/questions/663643/discover-par...


If there's anything that Chirp does that you feel is missing from Quiet, please let me know. I'd love to expand the feature set and make it more useful.


Do you think that you could share the script?


Sure, the base is just https://github.com/joextodd/chirp.io/blob/master/chirp.py, nothing too fancy...


Brings home just how easy it can be to break out of an air-gapped computer.


Vacuum-gapped computers, anyone?


Might have a small problem with heat dissipation


It could still communicate by drawing more or less power...


This would be great for cross platform zero-config party games. A lot of game genres can get by with surprisingly little bandwidth. Although for a noisy environment like a party you may need to do a little mesh networking in order for everyone in a room to be able to participate.


AFAIK, this is actually the only way to do Android <=> iOS multiplayer games without connectivity (iOS restricts the usage of Bluetooth too much).


Would it be feasible to implement this on a microcontroller, let's say a medium-sized Cortex-M or ESP8266? It would be useful in IoT, especially for the initial setup phase.


Why would you add a reasonably expensive ADC+microphone if you can just use BLE that's likely to have some other useful purpose for your IoT device?

Or, well, WLAN or 6LoWPAN, it is IoT after all.


In general, I think you are definitely right about this. Sound does have some potentially interesting applications. For example, if you had an exhibit that you wanted guests to interact with, that would probably eliminate many wireless choices. NFC would be a valid option, but not all phones support it, and you might need to make the user install an app. Quiet's sound transmission can work from the browser by using quiet.js. In general I think it's ever so slightly more versatile, at least until browsers make it easier to access wireless options.


The use case I had in mind specifically was initial configuration, mainly transferring wlan configuration data to the device. So this would not replace radio interfaces, but instead complement them.


I haven't tested it on that particular hardware but I think it might be feasible. Some modem profiles are more computationally expensive than others, and receiving is more expensive than transmitting.

At some point I'd like to try it on an RPi across a piezo and a cheap mic.


You could try making a "real" C library (with proper Makefile etc) out of it and compile it on a Pi, with a pulseaudio (or ALSA) configured to do loopback (i.e. route the audio output to the line-in input).


Way ahead of you ;) https://github.com/quiet/quiet-lwip

In fact, the Android library is just a wrapper on this, although a fairly non-trivial one.

libquiet's tests actually run exactly as you describe, on a loopback ALSA device.


Wire a microphone to ESP8266 and mass deploy them. What could possibly go wrong?


I believe the Amazon Dash buttons do this.


Data over sound is a common enough thing in the IoT space. For example: https://help.canary.is/hc/en-us/articles/204772717-Why-can-t...

(I work at Canary.)


Missed naming opportunity: AirgAPP


I don't have much to say other than I love the concept of doing something like this.


If I remember correctly, the DS game Bangai-O Spirits used sounds to share player-made levels between consoles.


Interesting use of the JNI. Why did you need to use JNI?


Well, the JNI means you get to run native code. For something like this, I think that's actually a requisite. My reasoning would be

a) This is real-time and could potentially be disrupted by GC pauses

b) JNI means you get to use OpenSL, the best and lowest latency sound engine on Android

c) This builds on top of libquiet, a C library, which itself builds on liquid dsp, another C library. Rewriting these in Java would be significantly more work than building the JNI wrapper. Especially true for liquid which is a mature library with lots of code


So why not JNA then?


What range of waves can a typical microphone record?


It really depends. On average, 100 Hz - 20 kHz isn't uncommon. Some only go up to about 16 kHz, though.


One also needs to take into account what the microphone is plugged into. For instance, you could use this to transmit data over a walkie-talkie, but those are choked at around 6 kHz.


CMIIW but didn't modems use to work like this? Like, put your phone receiver on top of the modem while it makes strange bleeping sounds?


What modulation scheme does this use?


The main audible profile uses OFDM + QAM. It also has a GMSK profile. This is thanks to liquid dsp's myriad modulation choices.

If you want to see quiet's configuration flexibility try this in Chrome https://quiet.github.io/quiet-profile-lab/


Cool, thanks.

Years ago I was playing around to learn some DSP and made a simple modem for speakers and microphone to send text messages via sound. I used "continuous-phase frequency shift keying" on a couple of different frequency pairs, and the receiver used FFT for decoding...

Never managed to get it working reliably above 300 bps. It would probably be much better to use the Goertzel algorithm tuned to the specific frequencies instead of using chunked FFT.

I'm curious about why some modulation schemes are better than others in this situation, but it seems like a lot of tricky math and information theory is needed to get it.

OK, some Wikipedia articles later: so OFDM basically means you send relatively long pulses of many parallel bit streams on many "orthogonally" spaced frequencies, each stream itself being modulated by some other scheme (in your case QAM, which in its simplest four-point form encodes two bits per symbol by combining phase and amplitude). The frequency spacing is chosen to avoid spectral interference, and the pulses are spaced to avoid temporal interference. With the relatively long pulse times and the many frequencies, FFT is probably the best method.
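The whole scheme collapses nicely into an inverse DFT: each QAM point becomes the complex amplitude of one subcarrier. A toy O(n^2) sketch of one symbol (real modems use an IFFT and add a cyclic prefix, both omitted here):

```python
import cmath

def ofdm_symbol(qam_points):
    """One OFDM symbol: the inverse DFT of the per-subcarrier QAM
    points. Orthogonal spacing means a receiver's forward DFT can
    recover each point without inter-carrier interference."""
    n = len(qam_points)
    return [sum(X * cmath.exp(2j * cmath.pi * k * t / n)
                for k, X in enumerate(qam_points)) / n
            for t in range(n)]

# QPSK / 4-QAM constellation: two bits per subcarrier per symbol.
QPSK = {(0, 0): 1 + 1j, (0, 1): -1 + 1j,
        (1, 1): -1 - 1j, (1, 0): 1 - 1j}
```

Compared to hopping between FSK frequency pairs, this spreads the data across all subcarriers simultaneously, which is where the rate advantage over ~300 bps schemes comes from.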


Does it also solve collision problems, when there is more than one transmitter?


How about TCP over light?



Most of it is, yeah? Fiber-optic cables transmit via laser beams.


Actually p2p file transfer via an infrared port[1] was a fairly common feature on laptops back in the day (this was probably before bluetooth took off). If you're on Windows, have a look in the "Network and Internet" section of the Control Panel. You may still have an Infrared utility there if your computer has that functionality.

[1] https://msdn.microsoft.com/en-us/library/aa940293(v=winembed...


IR was also used for syncing with phones, and for printing. Some PDAs (I'm looking at you, Psion) could print over IR.


Back in the day, I used an Orange contract Nokia 8310's IR modem and the freephone 0800 number that came with BT Internet to give my laptop internet.

Lasted around 6 months of near 24/7 use before Orange started detecting data and charging contract customers to call 0800 numbers.


I seem to remember the protocol was IrDA, it never worked very well for me.




You can probably use a phone's flash and camera for bidirectional data communication with another phone some distance away.



