2 Design
AudioDAQ is an end-to-end system for capturing, annotating, storing, sending, and processing analog sensor data. Figure 2 presents the overall system architecture. The system consists of four major blocks:
Hardware Interface. It includes a linear regulator to ensure that external sensors and circuits are supplied by a stable voltage source. The analog output signal from a sensor is modulated into the audio pass-band using a simple modulation scheme.
Mobile Phone. The modulated signal output by the hardware interface is fed into the next block, the mobile phone, which acts as an encoder and a storage device. Inside the phone, the signal is conditioned by the analog audio front-end and is captured using the built-in voice recording application, allowing for extended capture periods. The signal is compressed using the built-in audio compression utilities for efficient storage. The data are then transmitted to a remote server for processing.
Processor. The encoded audio data are received by a remote server and decoded. On the server, the data go through multiple stages of processing: normalization, edge detection, framing, and reconstruction. If more than one signal has been multiplexed into the composite waveform, it is demultiplexed at this point. In some cases, it is also possible to eliminate the encoding step and implement the processing functions directly on the mobile phone.
End User Applications. Finally, the sensor data reach the data application layer of the architecture. Here any implementation-specific transformations are performed and the data are formatted for use by the end user. In a typical application, this stage will process the data, extract domain-specific key metrics, and generate useful plots. In our example application of the EKG sensor, this step extracts the heart rate and plots the EKG.
2.1 Microphone Bias as Power Source
Recent work in creating peripheral devices for the headset port has focused on harvesting energy from the audio output driver. Since the audio output driver is designed to drive earphones and speakers, it can deliver many milliwatts [12]. However, using the headset port as a power source for peripheral devices presents significant design challenges.
While the audio output driver is capable of sourcing tens of milliamps, it does so at a voltage lower than typically required to operate power electronics. Further, it requires this power to be transferred as an AC signal in the audible frequency range. This approach was used in HiJack [12]. Software on the phone generates a sinusoidal waveform that is sent to the audio output driver and exported over an audio channel. Next, the signal is fed through a micro-transformer to reach a voltage level high enough to switch a FET. Finally, the signal is rectified and regulated to a stable DC voltage.
However, using the audio output driver has significant drawbacks. Custom-written phone software is required to generate the sinusoidal waveform, which draws substantial power on the phone. Moreover, converting the output of the audio driver into a usable DC voltage requires inefficient rectification circuitry. Finally, while the typical audio driver can deliver a significant amount of power compared with the microphone bias voltage, there is a high degree of variability between phones, making it difficult to design a circuit that is universal enough to work across many headsets.
In this study, we explore the limits of using the microphone bias voltage to power AudioDAQ and any attached sensors and circuits. We are aware of only one other contemporaneous system that uses the bias voltage to power active electronics in this manner, used to interface an Android phone with an amateur radio [10]. The microphone bias voltage is intended to power only the small amplifying FET found in electret condenser microphones and is only capable of delivering a small amount of current. AudioDAQ consumes approximately 110 µW, well below the maximum power delivery of a typical headset port.
Figure 4 shows the maximum deliverable power and the optimal point in the P-I-V space for both the microphone bias voltage and the audio output driver. Fewer phones were surveyed when measuring the audio output driver’s parameters due to the difficulties in developing custom software for each phone. The open-circuit voltage of the microphone bias line in the phones surveyed ranges from 1.7 V to 2.7 V. AudioDAQ requires 1.8 V to operate, making it compatible with nearly all of the phones we surveyed without requiring voltage boosting circuitry.
Using the microphone bias voltage as a power source offers a new set of design challenges. Since the microphone channel is used both to power the sensor and to transmit sensor data back to the phone, the power supply and data transfer characteristics are deeply coupled. Figure 3 shows a model of the phone circuitry responsible for processing data from the sensor and generating the microphone bias voltage.
R3 and C2 form a first-order RC filter to stabilize the linear regulator and prevent regulator control loop noise from reducing the fidelity of the analog signal from the sensor. This signal is of extremely low amplitude (10 mV peak-to-peak) to make it compatible with the phone’s audio processing circuitry. The cutoff frequency for this low-pass filter is set to 50 Hz, which is far below the modulation frequency of the analog signals. The microphone bias voltage is a relatively high-impedance voltage source and its output current is limited by R1, as shown in Figure 3. Therefore, components cannot draw even modest transient currents without proper bypass capacitors; otherwise, large voltage drops will result. However, the bypass capacitors must be kept small enough to ensure that they do not bypass the modulated signal itself.
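As a point of reference, the cutoff of a first-order RC low-pass filter follows f_c = 1/(2πRC). The short Python sketch below checks this relationship; the component values chosen for R3 and C2 are illustrative assumptions, not the actual AudioDAQ parts, which are not listed here.

    import math

    # Illustrative component values (assumptions, not AudioDAQ's actual parts)
    R3 = 3.3e3   # ohms
    C2 = 1.0e-6  # farads

    # First-order RC low-pass cutoff: f_c = 1 / (2 * pi * R * C)
    f_c = 1.0 / (2.0 * math.pi * R3 * C2)
    print(f"cutoff = {f_c:.1f} Hz")  # ~48 Hz, far below the 1.2 kHz modulation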
2.2 Acquiring Analog Sensor Data
The typical audio front end in a mobile phone is optimized to acquire signals with amplitudes around 10 mV peak-to-peak and audio frequencies in the 20 Hz to 20 kHz range. However, many signals either have principal frequency components below 20 Hz (e.g., EKG signals) or are purely DC in nature. This makes it difficult or impossible to pass them through the band-limited channel. To overcome this limitation, we use an analog multiplexer as a simple modulator to encode the analog signal into the audio pass-band by rapidly switching between signal and ground, as shown in Figure 5. The analog multiplexer is driven from a counter and clocked with an RC oscillator at 1.2 kHz.
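The following Python sketch illustrates this chopping scheme on synthetic data. It is a minimal sketch, assuming a simplified multiplexer that only alternates between signal and ground; the actual design cycles through four inputs per frame, as described below, and the sample rate is an assumption.

    import numpy as np

    fs = 44_100      # assumed audio capture rate (Hz)
    f_clk = 1_200    # multiplexer clock (Hz), as in AudioDAQ
    t = np.arange(0, 1.0, 1.0 / fs)

    # A slowly varying sensor voltage (2 Hz here), below the audio pass-band
    sensor = 0.9 + 0.9 * np.sin(2 * np.pi * 2 * t)

    # Chop the signal: alternate multiplexer slots select signal and ground,
    # translating the sub-audio content into the audio pass-band.
    slot = (np.floor(t * f_clk) % 2).astype(bool)
    modulated = np.where(slot, sensor, 0.0)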
We expect the analog signal from the sensors to be a high-impedance voltage anywhere between system ground and the reference voltage of 1.8 V. The magnitude of this signal is too large to be fed directly into the microphone line. To fit within the 10 mV limit, we add a scaling resistor between the output of the multiplexer and the microphone bias line. This resistor is identified as R4 in Figure 3. Our calculations indicate that sizing the resistor around 200 kΩ scales the signal appropriately into an amplitude range that does not overwhelm the audio processing electronics for the mobile phones that we surveyed.
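As a sanity check on this sizing, R4 and the bias network’s internal resistance R1 form a resistive divider for the multiplexed signal. The sketch below estimates the resulting amplitude under this simplified model; the value used for R1 is an assumption, since it varies from phone to phone.

    # Simplified divider model (Figure 3): the signal passes through R4 and
    # is loaded by the bias network's internal resistance R1.
    R1 = 2.0e3   # ohms, assumed internal bias resistor (phone-dependent)
    R4 = 200e3   # ohms, scaling resistor

    v_full_scale = 1.8                     # volts at the multiplexer output
    v_mic = v_full_scale * R1 / (R1 + R4)  # amplitude on the microphone line
    print(f"{v_mic * 1e3:.1f} mV")         # ~17.8 mV, near the ~10 mV target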
Variability in the microphone bias voltage causes variations in the amplitude of the multiplexed signal. The mapping of signal amplitude to ADC counts among headsets is inconsistent, with each phone having a slightly different scaling factor when capturing an audio signal. These variations make it impossible to directly recover the absolute magnitude of the analog signal. To estimate the absolute voltage of analog input signals, we add a reference voltage generated by the linear regulator to the multiplexer. This effectively time-division multiplexes the output signal from the multiplexer between ground, the analog signal, and a known reference voltage. This allows us to later recover the absolute ground-referenced DC value by scaling and shifting the analog signals with respect to the reference and ground voltages.
The connections to the input of the multiplexer shown in Figure 5 are, in order, a voltage reference, ground, analog signal, and ground (again). Switching to a low impedance system ground after each voltage signal helps remove residual charge buildup on the capacitor C1 of the high-pass filter on the mobile device.
The final step in creating a flexible analog input design is allowing for the simultaneous capture of multiple input channels. We realize this feature in our design by simply duplicating this four-signal block on the multiplexer for each additional channel we wish to capture. Our present design, shown in Figure 5, enables simultaneous capture of two channels. If it is necessary to capture just a single channel, the two inputs are tied together with an on-board jumper.
2.3 Power Efficiency
AudioDAQ uses an efficient linear regulator. However, the input filters add some resistance to the power path, which results in some amount of power loss. These filters are nevertheless necessary for separating the signal and power components that share the microphone bias line. Since AudioDAQ draws only 110 µW, its efficiency is nearly inconsequential when compared with the other subsystems in the mobile phone. It is more important to consider the design decisions that influence the draw of other subsystems, like the CPU and storage, which have more impact on battery life.
To better understand how the design of AudioDAQ influences the total power draw of the mobile phone, and to avoid optimization of subsystems which have relatively minor contributions to the overall power budget of the system, we perform several experiments on the HiJack platform [12]. We used the publicly available source code and selectively disabled parts of the system to measure the approximate power draw of each subsystem. The results from these experiments are summarized in Table 1. While these numbers are specific to HiJack and the iPhone, we expect that they generalize to other platforms and devices.
From Table 1, we can see that, with the exception of the screen, which can be easily disabled by pressing the power button on most mobile phones, there is no clear candidate for optimization, so we sidestep the question of optimizing a particular subsystem.
By choosing to use the microphone bias voltage to power our system instead of the audio output driver, we eliminate the power required to generate the output audio waveform and reduce the power required for I/O to the audio channels. This allows the phone to keep a large portion of the audio acquisition interface inactive.
By encoding sensor data in the audio pass-band and simply recording it for later processing, we reduce the power required to process the input signal. This is possible because of the efficient codec hardware accelerators found in many mobile phones.
2.4 Capturing and Storing Data Efficiently
For long-term data acquisition, data capture and storage must be efficient. We do not usually process the data on the phone, so the entire audio stream must be stored. Storing raw data would be space prohibitive, so we employ the built-in compression utilities found on the phone. Almost all mobile phones come bundled with a voice memo application that makes use of these algorithms to record low-quality audio suitable for voice memos.
On iOS devices, the voice memo application stores data in the Advanced Audio Coding (AAC) format, a standard widely used by Apple, at 64 kbps. Samples are taken at 44.1 kHz from a single channel. On Google Android phones, the built-in application uses the Adaptive Multi-Rate (AMR) encoding with an 8 kHz sample rate by default. Many other formats, including AAC, are available as part of the API, and sound recording applications that produce higher-quality recordings do exist. Many feature phones also use the AMR encoding because it is specially designed for mobile applications and has been widely adopted.
All these codecs can sufficiently compress audio into file sizes practical to store on a mobile phone. Both smartphones and feature phones often come with built-in hardware support for these compression algorithms. On iOS and Android devices, specific media subsystems are exposed to developers that allow for hardware-enhanced encoding. On feature phones, the CPUs often have special multimedia modules. The implementation of the codecs is done either completely in hardware or in heavily optimized low-level programming languages. Therefore, storing the audio data is efficient across most phones, and codecs do a good job of compressing the raw audio data into reasonable file sizes.
2.5 Processing Sensor Data
The original signal is typically extracted from the multiplexed analog sensor data on a remote server. The audio files are uploaded to this remote server via e-mail (though they could also be processed on the phone), where they are immediately processed and the data are extracted. Most feature phones and smartphones manufactured today have the software capabilities to record and transfer a voice memo, making the AudioDAQ design quite universal for sensor data capture.
2.5.1 Signal Reconstruction
The signal reconstruction algorithm is implemented in Python, making use of external libraries to perform codec-specific decoding of the compressed audio files. It is designed to examine data in a rolling window to allow for both online and offline processing. It is also robust to noisy input data because it relies only on local data points and simply discards signals that are too noisy for proper reconstruction.
The simplicity of the hardware interface block in our design of AudioDAQ poses a challenge for the signal reconstruction algorithm. Since the analog multiplexer can send no channel delimiter, the framing information must be implicitly determined. Signal reconstruction in the AudioDAQ system occurs in five stages:
Decoding. The audio encoding format is deduced from the file extension and encoding specific magic bytes. The appropriate decoder is run on the data and the raw audio information is extracted.
Edge Detection. The transition edges that are created when the multiplexer switches signals are detected and marked on the data set.
Value Estimation. The regions between the edges are evaluated. Extreme outliers are discarded and an estimate of the value of the signal in that region is obtained.
Framing. Based on previously processed frames, the framing is deduced and tracked. Framing is important to determine which values correspond to ground, the voltage reference, and the actual analog signal. Each frame of input data consists of the four multiplexer signals as discussed in Section 2.2.
Calculation. The absolute analog signal voltage for the frame is calculated by expressing the signal value as a point between the ground and voltage reference value and then shifting and scaling the voltage with respect to the known voltage reference and ground.
A secondary benefit of including the ground to reference voltage transition is that it gives a reliable and repeating signal to help frame our data. Since the analog sensor signal could be at a value close to ground, it is impossible to reliably detect the edge transition between the signal and ground. In a frame the only two transitions we can reliably detect are the transition from the previous frame’s ground to the reference voltage, and the transition from the reference voltage to ground. These transitions are detected by finding the maxima and minima of the first-order numerical derivative of the signal. The distance between these two transitions is calculated and used to estimate the final two transitions of the frame between the unknown analog signal and ground. The vertical bars in Figure 6(b) show the detected and calculated edge transitions for a short period of input signal.
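A minimal sketch of this edge-detection step is shown below, assuming the decoded audio for roughly one frame is available as a NumPy array with the rising edge preceding the falling edge; the function name and windowing are illustrative, not our actual implementation.

    import numpy as np

    def detect_frame_edges(window: np.ndarray):
        """Estimate the four edge positions for one frame of modulated audio.

        Only the ground->reference and reference->ground transitions are
        reliably detectable, as the extrema of the first-order difference.
        The two edges around the unknown analog signal are extrapolated
        from the spacing of the first two.
        """
        d = np.diff(window)
        rise = int(np.argmax(d))  # previous frame's ground -> reference
        fall = int(np.argmin(d))  # reference -> ground
        slot = fall - rise        # samples per multiplexer slot
        # Frame layout: reference, ground, signal, ground (Section 2.2)
        return rise, fall, fall + slot, fall + 2 * slot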
After edge detection, the regions between the edges are evaluated. The high-pass filter capacitor begins to affect the signal immediately after switching, so the left-most portion of each region is used to estimate that region’s nominal value. A small number of points are averaged to a single value. These values are also plotted in Figure 6(b).
Finally, the analog signal value is extracted. Up until now, all processing has been done with ADC counts. To obtain the actual real-valued voltage, we express the analog signal as a value between the voltage reference and ground. The modulation scheme produces two ground values per frame; we use the average of these two values. After obtaining this value, we can then multiply it by the known reference voltage (1.8 V in our system) to obtain the original analog signal.
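The per-frame calculation reduces to a shift and a scale. The sketch below expresses it directly, taking the per-region estimates produced in the value estimation stage; the function and argument names are illustrative.

    def reconstruct_voltage(ref, gnd1, sig, gnd2, v_ref=1.8):
        """Convert one frame's per-region ADC count estimates into volts.

        The signal is expressed as a fraction of the reference-to-ground
        span, using the average of the frame's two ground values, then
        scaled by the known 1.8 V reference.
        """
        gnd = (gnd1 + gnd2) / 2.0
        return (sig - gnd) / (ref - gnd) * v_ref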
2.5.2 Offline Processing
The most common scenario for data processing involves collecting data to a compressed file and sending the file to a server for post-processing. This has the advantage of mitigating the power cost of transmitting the data to the remote location by delaying it until it is convenient for the operator, such as when the phone is near a charger, connected to a faster wireless network like 802.11b, or docked with a desktop.
An alternative offline processing scheme that was considered but not explored involved recording the data using the voice memo hardware and then periodically reading it in and performing a high-performance, faster-than-real-time computation on the data. This batch processing avoids the high idle cost of the CPU and the high cost of waking the device from sleep, and it still offers near-real-time performance. Since a major strength of the AudioDAQ system is its compatibility with almost any hardware without requiring additional software, we chose not to explore this option.
2.5.3 Online Processing
A less common scenario involves processing the data in real time. This is useful for demonstrative purposes where the sensor data are wirelessly transmitted to a remote host for real time display. Assuming a sufficiently fast connection is available, it is possible to stream data to a remote host. Audio encoding algorithms for VoIP systems such as Speex, which has a mobile port available, make this possible over TCP.
Even a simple telephone call could provide the bandwidth necessary to stream the data. However, streaming in real time would dramatically reduce the battery life due to the greater power demands of the wireless radio in the mobile phone.
2.6 Capturing Voice Annotations
An obvious addition to an analog sensor capture system is a method to annotate the samples. Since we are limited by the phone’s built-in voice recording application, we cannot allow the user to input text directly to be stored alongside the collected data. However, we can collect voice annotations using the same application that we use to collect the sensor data by alternating data and voice in the captured audio files, in effect recursively time-division multiplexing the signal.
Mobile phones typically detect the presence of a microphone by the DC voltage present on the microphone bias line. Most microphones pull the voltage of the bias line down past a certain threshold. When the phone detects this, it automatically switches between the internal microphone and the externally connected peripheral. We exploit this behavior by adding a momentary disconnect switch to the peripheral, effectively allowing the user to pause data collection to inject and record a short audio annotation in-band (and in-line) with the collected data.
Since AudioDAQ has a distinctive principal switching frequency, it is easy to algorithmically detect the difference between voice data and analog sensor data. This is done server-side. Next, using open-source voice recognition libraries, the speech is converted into text, which is paired with the reconstructed data.
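One plausible way to implement that detection, sketched here under assumed bin widths and thresholds rather than as our actual server-side algorithm, is to measure how much of a block’s spectral energy is concentrated at the switching frequency and its harmonics:

    import numpy as np

    def looks_like_sensor_data(block: np.ndarray, fs: float,
                               f_clk: float = 1200.0,
                               threshold: float = 0.2) -> bool:
        """Heuristic: treat a block as multiplexed sensor data when a
        large share of its non-DC energy sits near the switching
        frequency and its first harmonics (threshold is an assumption)."""
        spectrum = np.abs(np.fft.rfft(block)) ** 2
        freqs = np.fft.rfftfreq(len(block), 1.0 / fs)
        near_clk = sum(spectrum[np.abs(freqs - k * f_clk) < 50.0].sum()
                       for k in (1, 2, 3))
        return near_clk / spectrum[1:].sum() > threshold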
2.7 Mechanical Design
We chose a 3.5 mm headset jack because it has become the standard for smartphones with the introduction of applications such as music playback that make use of external headsets. Further, the mapping of pins to logical signals in 3.5 mm headset jack implementations is more consistent when compared to the now less common 2.5 mm interface for which we found two common, yet separate mappings. Table 2 shows the different pinout configurations for audio jacks across various phones. If required, inexpensive adapters and “proprietary” headset connectors are available to connect 3.5 mm peripherals to 2.5 mm ports [5].
We chose a square-inch form factor for AudioDAQ and a pinout that is mechanically compatible with existing HiJack sensors, although most of the HiJack I/O lines are left unconnected. If necessary, the circuitry could be made more compact or even built into a single integrated circuit and incorporated into a molded audio headset jack itself. The present square-inch form factor gives a good trade-off between small size and ease of development and debugging.
2.8 Low-Power EKG Sensor
Mobile devices have the potential to seamlessly blend health care into our daily lives [8, 18]. Evolving mobile health care technology can help professional caregivers to better monitor their patients, detect problems earlier, and reduce the need for hospital visits, which are often expensive and inconvenient [19, 31]. In addition, mobile health care can empower individuals to better monitor and manage their own health, encouraging them to live healthier lifestyles and prevent many health problems before they begin by providing methods for early detection. For these reasons, we chose to develop a low-power, low-cost, portable EKG monitor which interfaces to a mobile phone using AudioDAQ and illustrates the key operating principles. This battery-free sensor enables monitoring of an individual’s cardiac activity for extended periods of time using relatively inexpensive electronics and existing mobile phones. It allows for data collection across a wide variety of phones, and for transmission of the data to a remote location where it is analyzed by automated algorithms for abnormalities, or by doctors for diagnosis.
The EKG sensor is a two lead device, with the leads attached to the subject’s body using conductive hydrogel electrodes either on the left and right arm or wrist or directly across the chest. The signal is passed through two stages of amplification with active filters in between to remove noise as shown in Figure 7.
Amplifying cardiac signals, which are in the range of −5.0 mV to +5.0 mV [30], and filtering out stray environmental noise captured by the human body poses a significant design challenge, made more difficult by the power budget constraints imposed by the AudioDAQ system. Instrumentation amplifiers were chosen with exceptionally low power draw. The first stage of amplification uses a differential operational amplifier with a high common-mode rejection ratio of 95 dB, which rejects common-mode noise found across the entire human body, leaving only the differential signal from muscle contractions. It has a gain factor of five. It is integrated with a high-pass feedback filter that dynamically corrects DC shift in the ungrounded differential signal captured from the body. The human body acts as a large antenna around modern electrical grids. Therefore, the amplified signal is passed through a notch filter designed to remove 60 Hz common-mode interference and noise. Finally, the signal is fed through the last operational amplifier, which also acts as a low-pass filter. It amplifies the filtered signal with a gain of twenty to bring it into the realm of voltage amplitudes commonly seen by analog-to-digital converters and to be used by AudioDAQ.
The useful bandwidth of an EKG signal is between 0.05 Hz and 150 Hz. We design the high-pass and low-pass filters, which act together as a band-pass filter, to match this frequency range. The output EKG signal is biased at approximately half of the supply voltage using a voltage divider. This minimizes the possibility that a large voltage spike (which can occur when the heart muscles contract) will be clipped by the operational amplifiers operating at their supply rails. The amplifier gains are configured such that, with good skin-to-lead conductivity and lead placement, the EKG signal will have an amplitude of approximately 500 mV at the operational amplifier’s output line.
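For reference, each first-order RC corner follows f_c = 1/(2πRC). The component values below are illustrative assumptions showing one way to hit the 0.05 Hz and 150 Hz corners, not the parts used in our sensor.

    import math

    def r_for_cutoff(f_c: float, c: float) -> float:
        """Resistance giving a first-order RC cutoff f_c = 1/(2*pi*R*C)."""
        return 1.0 / (2.0 * math.pi * f_c * c)

    print(r_for_cutoff(0.05, 10e-6))    # high-pass: ~318 kohm with 10 uF
    print(r_for_cutoff(150.0, 100e-9))  # low-pass:  ~10.6 kohm with 100 nF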
The EKG sensor module is interfaced with AudioDAQ to deliver the resulting EKG trace. Figure 9 shows the real-time EKG waveform of an individual captured with this system. A typical cardiac cycle (heartbeat) consists of a P wave, QRS complex, and T wave, all of which can be seen in our trace. The EKG sensor interfaces with the square-inch AudioDAQ base platform and draws only 216 µW. Table 3 shows the cost breakdown of an EKG sensor module. The total cost is $24, of which two-thirds is spent on leads, which could be commoditized at larger volumes for a significant cost reduction.
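As an example of the end-user processing stage described in Section 2, heart-rate extraction from the reconstructed trace can be sketched as simple R-peak detection; the threshold and refractory period below are illustrative assumptions, not the values used in our application.

    import numpy as np

    def estimate_heart_rate(ekg: np.ndarray, fs: float) -> float:
        """Estimate beats per minute from a reconstructed EKG trace.

        R peaks are taken as local maxima above a fraction of the peak
        amplitude, with a refractory period to avoid double counting.
        """
        threshold = 0.6 * float(np.max(ekg))
        refractory = int(0.3 * fs)      # ignore peaks within 300 ms
        peaks, last = [], -refractory
        for i in range(1, len(ekg) - 1):
            if (ekg[i] > threshold and ekg[i - 1] <= ekg[i] >= ekg[i + 1]
                    and i - last >= refractory):
                peaks.append(i)
                last = i
        if len(peaks) < 2:
            return 0.0
        return 60.0 * fs / float(np.mean(np.diff(peaks)))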