VE4BRE amateur radio

Synthesizer Encoder

Summary

I have created an interactive synthesizer program for the Raspberry Pi Pico microcontroller board. It’s capable of accurate frequency- and phase-shift keying, and I’ve confirmed that it generates intelligible Q65 and RTTY messages, relying on external software to provide the symbol sequence.

Status and Future Work

I am still testing phase-shift keying with BPSK31, which is the only phase-shift keying mode currently implemented. Implementing arbitrary phase-shift keying would only require a different method for interpolating between the symbols. The program currently sets an initial phase and a rotation for each symbol, and relies on ring modulation and accurate symbol timing to provide a cosine filter between the symbol phases. It would be nice to use a FIR filter for phase interpolation, but I’m not sure that would run fast enough at the 8 kHz sample rate.
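
As a rough illustration of the cosine transition between symbol phases, here is a minimal sketch in plain Python with floating point. It is not the program’s actual code: the function name and constants are hypothetical, and it assumes the usual PSK31 convention that a 0 bit reverses the phase and a 1 bit leaves it unchanged. Multiplying the carrier by this envelope is the ring-modulation step.

```python
import math

SAMPLE_RATE = 8000                            # Hz, as used in this project
BAUD = 31.25                                  # PSK31 symbol rate
SAMPLES_PER_SYMBOL = int(SAMPLE_RATE / BAUD)  # 256 samples per symbol

def bpsk_envelope(bits):
    """Return one envelope value per sample for a sequence of bits.

    Assumed convention: 0 = phase reversal, 1 = no change.
    A reversal follows half a cosine cycle, so the amplitude passes
    smoothly through zero instead of jumping from +1 to -1.
    """
    sign = 1.0
    envelope = []
    for bit in bits:
        if bit == 0:
            for n in range(SAMPLES_PER_SYMBOL):
                envelope.append(sign * math.cos(math.pi * n / SAMPLES_PER_SYMBOL))
            sign = -sign
        else:
            envelope.extend([sign] * SAMPLES_PER_SYMBOL)
    return envelope

env = bpsk_envelope([0, 1])  # one reversal, then one steady symbol
```

The transition only lines up with the symbol boundaries if the sample count per symbol is exact, which is why accurate symbol timing matters here.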

The frequency-shift keying modes all seem to be working well. They do not use any shaping filters, though they keep the output signal continuous.

It would be nice to include the message encoding capability in the program. The encoding dictionaries should fit well in the Pi Pico’s generous RAM.

Motivation

For future optical communication experiments, I needed a synthesizer with output suitable for driving LEDs. I had done some previous experimentation with this in my morse keyer project. High-frequency PWM (192 kHz) seems to work well through a Darlington array to an LED bank. The duty cycle of the output needs to be varied at a rate that is at least double the highest frequency to be reproduced. I chose 8000 Hz: typical soundcard modes have bandwidths within 3000 Hz, so the rate would need to be at least 6000 Hz, and a little higher than that is a little better.

Feature Requirements

Decodable by sound card and standard software

For testing purposes, as well as field use, popular software such as FLDigi and WSJT-X should be capable of decoding the signals once they’ve been converted to line-level audio-frequency input. This conversion should only require simple filtering and attenuation or amplification.

Interactive generation of signals

The program shall generate signals when directed by a controlling interface. Control messages shall be abstracted to the symbol or message level, with the program being responsible for the conversion of symbol indices into the appropriate output.

E.g.

q65 = Q65CodeBank(1000.0, 15, 'A')
q65.send_symbols([0,15,0,20...

Weak signal mode encoding

Since I am motivated by experimental optical communication, popular weak-signal digital modes are a priority. My understanding of the properties of optical channels is very limited, but I know from W1VLF’s YouTube videos that Q65 has seen some success. I decided to include RTTY because it’s also FSK, but it’s a keyboard mode and not reliant on time synchronization.

PSK31 is included because of its high efficiency, and I suspect optical channel conditions will support it well (no multipath fading for LED intensity modulation). It can also be set to a frequency that makes it easy to spot with shutter-speed beating on a smartphone.

Implementation

The MicroPython development environment is very convenient for this project, because it provides an interactive interface on the Pi Pico’s USB serial port. This interface is the MicroPython REPL (a kind of command prompt). Additionally, the language’s generator features are convenient for message and symbol processing. I make considerable use of the micropython.viper decorator, which allows for machine-code compilation of some highly constrained code. This optimization was necessary in order to get acceptable performance in the interrupt handler.

IQ Vector synthesis

The standard library’s sine and cosine functions operate on floating-point values. Because the Pi Pico lacks a floating-point unit, they perform very poorly, with an average execution time of 293 microseconds, which is far too long for the 125 microsecond budget that’s available with an 8000 Hz sample rate.

Being familiar with CORDIC methods, I realized that the advancing of phase in complex number representation allows me to get the cosine for free by just using the real part. As long as the phase advancement vectors are computed ahead of time (as they can be in FSK and PSK modes), it’s hard to imagine anything more efficient than a single complex number multiplication. This will need to use fixed-point arithmetic, however.
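
To illustrate the idea, here is a minimal sketch using Python’s built-in complex numbers in floating point (not the fixed-point form the Pico needs). The function names are hypothetical and chosen for this example only. The rotation vector is computed once per tone, after which each sample costs a single complex multiplication.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz, the sample rate used in this project

def rotation_vector(freq_hz, sample_rate=SAMPLE_RATE):
    """Unit complex number that advances the phase by one sample period."""
    return cmath.exp(2j * math.pi * freq_hz / sample_rate)

def synthesize(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Yield cosine samples by repeatedly rotating a unit phasor."""
    rot = rotation_vector(freq_hz, sample_rate)  # computed ahead of time
    phasor = 1 + 0j                              # start at phase zero
    for _ in range(n_samples):
        yield phasor.real                        # the cosine comes for free
        phasor *= rot                            # one complex multiplication

# 1000 Hz at 8000 samples/s advances the phase by pi/4 per sample.
samples = list(synthesize(1000.0, 8))
```

Changing the tone only means swapping in a different rotation vector. The phasor itself carries over, so the output stays phase-continuous across the change.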

Fixed point arithmetic

The fixed point format i17f15 (the first 17 bits being the integer part, the last 15 bits being the fractional part) was chosen because it provides the maximum resolution for values -1.0 to 1.0 without risk of overflow when multiplying. To feed these values to the PWM duty cycle, I simply add (1 << 15) and mask with 0x0000FFFF, which yields values 0 to 65535 in an unsigned short representation and uses the full dynamic range of the PWM device on the Pi Pico.
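
The arithmetic can be sketched in plain Python as follows. This is an illustration of the format under the assumptions stated in the comments, not the viper code used on the Pico, and the helper names are hypothetical.

```python
FRACTION_BITS = 15
ONE = 1 << FRACTION_BITS  # the value 1.0 with 15 fractional bits

def to_fixed(x):
    """Convert a float in the range -1.0..1.0 to fixed point."""
    return int(round(x * ONE))

def fixed_mul(a, b):
    """Multiply two fixed-point values.

    For inputs within -1.0..1.0 the raw product has magnitude at most
    2**30, so it fits in a signed 32-bit word before the shift.
    """
    return (a * b) >> FRACTION_BITS

def complex_mul(ar, ai, br, bi):
    """Fixed-point complex multiplication: (ar + j*ai) * (br + j*bi)."""
    return (fixed_mul(ar, br) - fixed_mul(ai, bi),
            fixed_mul(ar, bi) + fixed_mul(ai, br))

def to_duty(value):
    """Map a fixed-point sample onto a 16-bit PWM duty cycle.

    Note: a value of exactly +1.0 (ONE) would wrap around to 0 after
    masking, so the amplitude has to stay just below full scale.
    """
    return (value + ONE) & 0x0000FFFF

mid_scale = to_duty(to_fixed(0.0))  # 32768, i.e. 50% duty
```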

Buffering

I had experimented with the use of a ring buffer to hold several symbols worth of rotation vectors, but there was a noticeable performance penalty to moving the values from the ring buffer to the working buffer where the interrupt handler would do its arithmetic. The interrupt handler would count the samples and know when to pull from the ring buffer, and occasionally spend longer than usual copying over that data and adjusting the ring buffer indices. I much prefer that the interrupt handler take a consistent amount of time to whatever extent possible. So, I decided to switch to a simple double-buffer technique, where the interrupt handler uses one of two buffers, and the assignment of symbol-related data to the buffers is left to the encoder subroutines, which are not as time-critical as the interrupt handler.

The encoder subroutine simply waits until the interrupt handler starts using the most-recently written buffer, then switches to the next buffer and writes its symbol-related data there.

The interrupt handler counts the number of samples it has produced for the current symbol and switches to the next buffer when it has reached the required number of samples. If the next buffer is not ready, it switches to a default output. The little bit of branching where it does index arithmetic and a bit of assignment into its working buffer (which should already be in the dirty cache) seems to add minimal performance penalty. The duration of the method when switching buffers is 2 microseconds longer than when it is not switching buffers.
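
The hand-off between the encoder and the interrupt handler can be sketched like this. It is a simplified model in plain Python, with no real interrupts and hypothetical names throughout. Here tick() stands in for the interrupt handler and try_write() for the encoder side.

```python
DEFAULT = "default"  # stand-in for the default output when no symbol is ready

class DoubleBuffer:
    def __init__(self):
        self.data = [None, None]      # symbol-related data, one slot each
        self.length = [0, 0]          # samples to produce for each slot
        self.ready = [False, False]   # written but not yet picked up
        self.active = 1               # slot the handler used most recently
        self.last_written = 1         # slot the encoder wrote most recently
        self.count = 0                # samples produced for the current symbol
        self.idle = True              # handler is producing the default output

    def try_write(self, symbol_data, n_samples):
        """Encoder side: refuse until the handler has started using the
        most recently written slot, then fill the other slot."""
        if self.ready[self.last_written]:
            return False
        slot = 1 - self.last_written
        self.data[slot] = symbol_data
        self.length[slot] = n_samples
        self.ready[slot] = True
        self.last_written = slot
        return True

    def tick(self):
        """Handler side: called once per sample."""
        if self.idle or self.count >= self.length[self.active]:
            nxt = 1 - self.active
            if not self.ready[nxt]:
                self.idle = True
                return DEFAULT        # next buffer not ready
            self.active = nxt         # switch buffers
            self.ready[nxt] = False
            self.count = 0
            self.idle = False
        self.count += 1
        return self.data[self.active]

db = DoubleBuffer()
trace = [db.tick()]                   # nothing written yet
db.try_write("A", 2)
refused = db.try_write("B", 2)        # handler has not picked up "A" yet
trace.append(db.tick())
db.try_write("B", 2)                  # accepted now
trace.extend(db.tick() for _ in range(4))
```

In this model the encoder would call try_write() in a polling loop, which mirrors the lock-step behaviour described here.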

The downside to this approach is that the encoder subroutines have to stay in lock-step with the interrupt handler, and have to poll for buffer availability. It would be nice to use the asyncio module to do some signalling, avoid the polling, and possibly free up the REPL, too. It would take some creative use of that library to get around the restrictions in hardware interrupt handlers.

Encoders

Q65

I used the table of frequency spacings and symbol timings from the “Quick-Start Guide to Q65” published in 2021 by Joe Taylor, K1JT; Bill Somerville, G4WJS; Steve Franke, K9AN; and Nico Palermo, IV3NWV.

The q65code program is used to generate symbol index sequences from standard messages like “VE4MA VE4BRE R-12”.

RTTY

I had trouble finding the specifications for the most common RTTY in use today, but through experimentation, I discovered that a starting space and a stopping mark need to be used, and that the code words must be serialized in “least-significant bit first” order. This allowed the messages to be decoded by FLDigi. I generated the sequences of code words using web-based tools.
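
The framing described above can be sketched as follows. The function name is hypothetical, and the sketch assumes 5-bit code words, that a 1 bit is sent as mark and a 0 bit as space, and a single stop element. Only the start space, stop mark and bit order come from my experiments.

```python
SPACE, MARK = 0, 1

def frame_code_word(code, data_bits=5, stop_bits=1):
    """Serialize one code word as a list of mark/space elements.

    Layout: one start space, the data bits least-significant bit first,
    then the stop mark(s). data_bits and stop_bits are assumptions.
    """
    elements = [SPACE]                        # start element
    for i in range(data_bits):
        elements.append((code >> i) & 1)      # least-significant bit first
    elements.extend([MARK] * stop_bits)       # stop element(s)
    return elements

framed = frame_code_word(0b00011)
# framed == [SPACE, MARK, MARK, SPACE, SPACE, SPACE, MARK]
```

Each element then selects the mark or space tone for one bit period.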

Performance

I wrote a ‘profile()’ method to examine the performance of the most critical parts of the synthesizer. The hardware interrupt handler that’s connected to the 8000 Hz timer has only 125 microseconds to complete, and it needs to leave ample room for the encoder subroutines to complete their work before the next symbol period begins. Through some careful use of micropython.viper pointer types, I was able to write very fast complex vector arithmetic methods for multiplication and normalization.

The interrupt handler performs two complex multiplications, two normalizations, and some basic arithmetic to produce the PWM duty cycle and feed it to the PWM peripheral. This takes about 52 microseconds of the 125 microsecond sample period, which is roughly 41.6% CPU busy time on one of the cores.

The encoder subroutines only need to do their work at the symbol rate, which is generally on the order of several milliseconds. Performance here has not yet become an issue, and I’ve been happy to use floating point arithmetic and standard library methods.

Output quality

The output quality seems to be very high, based on the clean appearance of waterfalls and the ease with which WSJT-X and FLDigi decode the signals. Additionally, I have used a rolling shutter effect to capture images on my smartphone of the output from an LED. These images show a smooth sinusoid in the direction of the rolling shutter, which indicates that the PWM output is indeed changing duty cycle as I expect. During debugging, this technique has been extremely helpful for discovering errors caused by overflow and buffer corruption.

Example of Rolling Shutter analysis