From 48ec6e04e557bb0fd9edc4c95be3e82ab3ed6bc5 Mon Sep 17 00:00:00 2001
From: indigo
Date: Mon, 23 Feb 2026 18:35:20 -0600
Subject: [PATCH] Update README.md

updated some writing
---
 README.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index b5fc4ae..0ccbea5 100644
--- a/README.md
+++ b/README.md
@@ -22,7 +22,7 @@ The STM32F103 has strengths and weaknesses. It boasts a Arm Cortex M3 that can b
 With no optimizations, the mandelbrot set took a few minutes to render on a 160x80 display. While waiting for my PCBs to arrive, I took a devkit I had on hand and started optimizing to chip down the time.
 
 ## Fixed Point Arithmetic
-We lack an FPU. The compiler needs to implement expensive operations to handle floats in software, and operations such as multiplication become expensive. To take advantage of the multiplication hardware, I used fixed point arithmetic. The position of the decimal is hardcoded. Integers are simply shifted to the left by n bits, and bits before point n are considered the decimal. Multiplication is carried out as per usual, but the result will be scaled by 2^n. To fix this, we’ll shift the result n bits to the right to scale it back. To prevent overflows during multiplication, I ended up multiplying with 64 bit integers, which was still much faster than float multiplication.
+We lack an FPU. The compiler needs to implement expensive operations to handle floats in software, slowing render times to hours. To work around our lack of FPU, I used fixed point arithmetic, which also takes advantage of our single cycle multiplication time. The position of the decimal is hardcoded. Integers are simply shifted to the left by n bits, and bits before point n are considered the decimal. Multiplication is carried out as per usual, but the result will be scaled by 2^n. So, after multiplication, we shift the result n bits to the right to scale it back. To prevent overflows during multiplication, I ended up multiplying with 64 bit integers, which was still much faster than float multiplication.
 
 While experimenting with decimal placement, I shifted the decimal too much to the left. I haven’t taken the time to fully understand how, but this introduced some really aesthetically pleasing bugs. I don’t fully comprehend why the bands form. Because numbers get clipped above a certain threshold, this decimal placement also creates copies of the mandelbrot set scattered throughout the complex plane in a way I think is really fun to explore.
 
@@ -41,7 +41,7 @@ I didn’t use the green channel, originally for aesthetic purposes. Later, as d
 There’s a lot of overhead to sending each pixel individually. The LCD controller requires coordinates for each drawing command, not to mention overhead of the SPI headers. The obvious solution is to keep a framebuffer and to send all data in one command. Due to limited memory, the framebuffer only contains half the screen; each half is rendered separately. Additionally, when panning the view, the framebuffer is simply shifted to avoid re-rendering parts already on the screen.
 
 ## Border Tracing
-The mandelbrot set is full; if a closed edge can be found, everything inside has the same number of iterations. This property can be exploited to dramatically decrease render times by searching for closed loops throughout the set. Because of the bugs introduced by fixed point arithmetic, I found this property to only remain true for points inside the set- points that require 255 iterations. Despite just appearing black, these were the points taking up a huge majority of the compute time, especially when zoomed out.
+The mandelbrot set is full; if a closed edge can be found, everything inside has the same number of iterations. This property can be exploited to dramatically decrease render times by searching for closed loops throughout the set. Because of the bugs introduced by fixed point arithmetic, I found this property to only remain true for points inside the set- points that require 255 iterations. Despite just appearing black, these points often take a majority of the compute time, especially when zoomed out.
 
 I decided to take a shot at border tracing myself. At this point, I had assembled my first working PCB, but developing and debugging the algorithm would be much easier on a computer. I quickly rewrote the code to work on my desktop using Raylib so I could debug faster- the STM32F1 contains only a few hardware breakpoints, and lacks the ability to set watchpoints.
 
@@ -66,13 +66,13 @@ Hardware was relatively straight forward. Every iteration fit on a 2 layer PCB.
 
 After discovering my error, I decided to make a more complex design that allowed recharging via USB-C for AAA batteries. To maximize battery longevity, a switching regulator was used for the step-up (battery) supply, while a simple linear regulator for USB-C. To avoid the batteries from discharging, I put a voltage supervisor to shut down the switching regulator at 1V. The switching regulator has true shutdown- blocking inputs when shut off. The voltage monitor uses a capacitor to time how long it should wait until turning back on above 1V. While finishing up the PCB I was tired and didn’t care to calculate this value, so I just slapped a 1uF cap down- hence the card takes around 5 seconds to become powered after a battery is inserted. Two ideal diode ICs are used for supply ORing so the supplies aren't in conflict while charging.
 
-My second iteration worked almost flawlessly. If it was unplugged while charging, however, I’d start running into issues. The USB bus was at 3v for some reason, powering components that should have been off. I found the ideal diode IC I was using as a power ORing circuit was latching on. Despite the name, ideal diodes don’t actually block current flowing from output to input. The power would travel through the linear regulator, powering the nmos that enabled the ideal diode IC. This was fixed by simply adding a diode, as circled in magenta.
+My second iteration worked almost flawlessly. If it was unplugged while charging, however, I’d start running into issues. The USB bus was at 3v for some reason, powering components that should have been off. I found the ideal diode IC I was using as a power ORing circuit was latching on. Despite the name, "ideal diode" ICs don’t actually block current flowing from output to input. The power would travel through the linear regulator, powering the nmos that enabled the ideal diode IC. This was fixed by simply adding a diode, as circled in magenta. I was able to fit the diode on the PCB via hand-made modifications.
 
 ![](writeup/bad_curcuit.png)
 
 In the photo above, bat_3v was the power rail of the battery voltage regulator. Obviously, when the nmos is active, the signal won’t drop to 0v as intended- the diode should be replaced with a resistor, and the 10k resistor should be replaced with a short. So I didn’t have to order new PCBs, I simply substituted a 0Ω for the 10kΩ, and made a messy solder bridge to fit a 10kΩ across the diode pads.
 
-When the switching regulator is used, my voltage briefly rises above 3.1v each duty cycle- a few mV above the maximum voltage for the LCD’s backlight. I don’t understand why I need a specific voltage for a backlight that is, in the end, just an LED- I’d think the voltage drop should just be constant. Despite this, the screen functions without issues. Maybe the lifetime is shortened, but I’m not making money and it works.
+I misunderstood the datasheet for the display's backlight, and thought it had to be run at exactly 2.9-3.1V. This is why my whole card runs at 3v. In reality, it's just an LED with a forward voltage of 3V. Because we power the backlight at exactly the forward voltage, the backlight doesn't burn itself out. However, the voltage is briefly above the forward voltage while the switching regulator powers on- probably not great for longevity.
 
 ## Assembly
 I used a hot air station to assemble the cards. The final PCBs were purchased a bit too close to the career fair, so a stencil would be very expensive to ship. To work around this, I just mixed my solder paste with flux. This was a mistake. While I get good results, each card takes a very long time to make. Maybe I was being too precise, but check out how many pads there are below. I was planning to hand them out at the fair, but I’m now only planning on handing them out during interviews. Additionally, getting the pins of the MCU to align properly and remove the solder bridges with wick was excruciating.
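Note on the Fixed Point Arithmetic hunk: the multiply-then-shift scheme it describes could look roughly like the C sketch below. The README doesn't give the actual value of n or any type or function names, so `FRAC_BITS`, `fixed_t`, `fixed_from_int`, `fixed_mul`, and `mandel_iters` are all assumptions made for illustration. Only the 64 bit intermediate, the right shift by n, and the 255 iteration cap come from the text.

```c
#include <stdint.h>

/* Assumption: 24 fractional bits in a signed 32-bit word (range about
 * -128..128). The README does not state the real n. */
#define FRAC_BITS 24

typedef int32_t fixed_t; /* hypothetical name for the fixed-point type */

/* Convert a small integer to fixed point by scaling it by 2^FRAC_BITS. */
static inline fixed_t fixed_from_int(int32_t v) {
    return (fixed_t)(v * ((int32_t)1 << FRAC_BITS));
}

/* Multiply two fixed-point values. The raw product carries an extra
 * factor of 2^FRAC_BITS, so it is computed in 64 bits to avoid overflow
 * and then shifted right to scale it back. Right-shifting a negative
 * value is implementation-defined in C; GCC on Arm shifts arithmetically. */
static inline fixed_t fixed_mul(fixed_t a, fixed_t b) {
    int64_t wide = (int64_t)a * (int64_t)b;
    return (fixed_t)(wide >> FRAC_BITS);
}

/* Escape-time iteration count for c = cr + ci*i, capped at 255.
 * Assumes cr and ci stay within roughly -2..2 so nothing overflows. */
static uint8_t mandel_iters(fixed_t cr, fixed_t ci) {
    const fixed_t four = fixed_from_int(4);
    fixed_t zr = 0, zi = 0;
    uint8_t i = 0;
    while (i < 255) {
        fixed_t zr2 = fixed_mul(zr, zr);
        fixed_t zi2 = fixed_mul(zi, zi);
        if (zr2 + zi2 > four) {
            break; /* |z|^2 > 4: the point has escaped */
        }
        fixed_t next_zi = 2 * fixed_mul(zr, zi) + ci;
        zr = zr2 - zi2 + cr;
        zi = next_zi;
        i++;
    }
    return i;
}
```

With this layout, 1.5 would be stored as `3 << 23`, and `fixed_mul` of that value with itself gives `9 << 22`, which is 2.25. Picking a larger `FRAC_BITS` trades range for precision, which may be related to the clipping artifacts the README mentions, though that is a guess.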