readme no images, videos or comparisons

indigo 2025-09-22 19:41:01 -05:00
parent b15afda232
commit 10627db180


@@ -1,57 +1,57 @@
# THIS README IS NOW OUTDATED. THE PROJECT IS WAY COOLER AND AN ARTICLE IS COMING SOON!
## I've implemented my own border tracing algorithm to dramatically speed up rendering.
There are a few other optimizations, such as shifting image buffers around. <br>
## Also, the card is rechargeable now.
The software is still in disarray and could be optimized, but it runs smoothly. The hardware is working, though I haven't had time to test all functionality. The battery does charge and discharge safely (cutoff at 1V).<br>
## Here's a crummy picture of the card.
![STM32 Card Prototype 2](https://git.bpcspace.com/indigo/stm32_business_card/raw/branch/main/imgs/new_card.jpg)
## A quick video of the border tracing in action; a better video is coming soon.
![border tracing demo](https://git.bpcspace.com/indigo/stm32_business_card/raw/branch/main/imgs/border_demo.mkv)
# Writeup
## Computing the Mandelbrot Set
The Mandelbrot set is a famous two-dimensional fractal known for its infinite complexity. Zooming past the famous bulbs, you're bound to find some extremely beautiful patterns if you process and color the data right. One thing that has always interested me about the set is how simply the fractal is rendered. It's spooky how basic logic can create such complex patterns that will forever be explored by mathematicians. For this reason, graphing the Mandelbrot set has always been my version of a "hello world": a quick, cool program I write when testing new languages, environments, etc. I've implemented the fractal at least a dozen times.
The Wikipedia page is a good resource for understanding the Mandelbrot fractal. In short, each pixel on the screen is translated to a complex number, which is fed into a simple formula up to 255 times. If the result rises above a certain magnitude, the pixel is shaded based on the iteration count and the next pixel is processed. Pixels with a higher iteration count take longer to process.
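As a rough illustration of that loop, here is a minimal floating point sketch in C. It is not the card's code: the function name is made up for this example, and the escape radius of 2 is an assumption (the usual textbook choice), since the text above only says "a certain magnitude".

```c
#include <stdint.h>

#define MAX_ITER 255  /* iteration cap mentioned in the text */

/* Iterate z = z*z + c starting from z = 0 and count the steps taken
 * before |z| exceeds 2 (assumed escape radius). Returns MAX_ITER if the
 * point never escapes, which is treated as "inside the set". */
uint8_t mandelbrot_iterations(float c_re, float c_im)
{
    float z_re = 0.0f, z_im = 0.0f;

    for (int i = 0; i < MAX_ITER; i++) {
        float re2 = z_re * z_re;
        float im2 = z_im * z_im;

        /* |z| > 2, compared as |z|^2 > 4 to avoid a square root */
        if (re2 + im2 > 4.0f)
            return (uint8_t)i;

        /* (a + bi)^2 + c = (a^2 - b^2 + c_re) + (2ab + c_im)i */
        z_im = 2.0f * z_re * z_im + c_im;
        z_re = re2 - im2 + c_re;
    }
    return MAX_ITER;
}
```

Points that bail out early are cheap; points that run all 255 iterations are where the time goes.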
I wanted my business card to let users explore the set with little to no wait time. I also wanted people to be able to zoom in far enough that they'd never run out of new patterns to explore, which requires relatively high precision. Achieving this was more difficult than I anticipated due to the nature of the STM32F103.
The STM32F103 has strengths and weaknesses. It boasts an Arm Cortex-M3 that can be clocked up to 72MHz (I ran at 62MHz using the internal oscillator). There's single-cycle 32 bit hardware multiplication. Memory access takes only a single cycle. With the MCU being on the lower end of the STM32 lineup, there are also drawbacks. There's no FPU; all float operations have to be done in software. The MCU also has only 20 KB of RAM, not even enough to hold a full framebuffer for my tiny display.
With no optimizations, the Mandelbrot set took a few minutes to render on a 160x80 display. While waiting for my PCBs to arrive, I took a devkit I had on hand and started optimizing to chip away at the render time.
## Fixed Point Arithmetic
We lack an FPU. The compiler has to emulate floats in software, so operations such as multiplication become expensive. To take advantage of the multiplication hardware, I used fixed point arithmetic. The position of the decimal point is hardcoded: values are simply shifted to the left by n bits, and the lowest n bits are treated as the fractional part. Multiplication is carried out as usual, but the result ends up scaled by an extra factor of 2^n. To fix this, we shift the result n bits to the right to scale it back. To prevent overflows during multiplication, I ended up multiplying with 64 bit integers, which was still much faster than float multiplication.
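A minimal sketch of that multiply in C. The names `fixed_t`, `fixed_from_int` and `fixed_mul` are made up for this example, and `FRAC_BITS = 24` is an assumption; the text only says "n bits".

```c
#include <stdint.h>

#define FRAC_BITS 24            /* hypothetical n: 8 integer bits, 24 fractional */

typedef int32_t fixed_t;        /* real value = stored integer / 2^FRAC_BITS */

/* Convert a small integer (|v| < 128 with 24 fractional bits) to fixed point. */
fixed_t fixed_from_int(int32_t v)
{
    return (fixed_t)(v * (1 << FRAC_BITS));
}

/* Multiply two fixed point values. The 64 bit product is scaled by
 * 2^(2 * FRAC_BITS), so shifting right by FRAC_BITS brings it back to
 * the 2^FRAC_BITS scale. Right-shifting a negative value is
 * implementation-defined in C; GCC and Clang shift arithmetically. */
fixed_t fixed_mul(fixed_t a, fixed_t b)
{
    int64_t wide = (int64_t)a * (int64_t)b;
    return (fixed_t)(wide >> FRAC_BITS);
}
```

For example, 0.5 is stored as `1 << 23`, and `fixed_mul(1 << 23, 1 << 23)` gives `1 << 22`, which is 0.25. The widening to `int64_t` is what keeps the intermediate product from overflowing 32 bits.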
While experimenting with decimal placement, I shifted the decimal too far to the left. This introduced some really aesthetically pleasing bugs; I haven't taken the time to fully understand how, and I don't fully comprehend why the bands form. Because numbers get clipped above a certain threshold, this decimal placement also creates copies of the Mandelbrot set scattered throughout the complex plane in a way I think is really fun to explore.
At this point, the set rendered nearly instantaneously in locations that didn't include the black blobs of the set. Render time decreased from minutes to a few seconds.
## Color
The color for each iteration count is computed beforehand and stored in a lookup table. This is faster than computing on the fly, as the STM32F1 has single cycle memory access time.
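A lookup table like this could be built as follows. This is only a sketch: it assumes the display takes 16 bit RGB565 pixels, and the red/blue gradient and every name here are invented for illustration, not taken from the card's firmware.

```c
#include <stdint.h>

#define MAX_ITER 255

/* One RGB565 color per iteration count: 256 entries * 2 bytes = 512 bytes. */
uint16_t palette[MAX_ITER + 1];

/* Pack 5 bit red and 5 bit blue into RGB565, leaving the 6 green bits zero. */
uint16_t rgb565_no_green(uint8_t r5, uint8_t b5)
{
    return (uint16_t)(((r5 & 0x1Fu) << 11) | (b5 & 0x1Fu));
}

/* Fill the table once at startup; rendering then only indexes into it. */
void palette_init(void)
{
    for (int i = 0; i <= MAX_ITER; i++) {
        uint8_t r = (uint8_t)(i >> 3);         /* ramps 0..31 across the table */
        uint8_t b = (uint8_t)(31 - (i >> 3));  /* hypothetical gradient */
        palette[i] = rgb565_no_green(r, b);
    }
    palette[MAX_ITER] = 0x0000;  /* points that never escape are drawn black */
}
```

Shading a pixel is then a single read, `palette[iterations]`, with no per-pixel color math.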
I didn't use the green channel, originally for aesthetic purposes. Later, as described in border tracing, I needed to store which pixels were rendered, and whether or not they were in the Mandelbrot set. With my framebuffer and color lookup table, there was only about 500B of memory left, so I ended up using the green channel to store this information. This gave me a convenient way to access data on each pixel, and a conveniently visual way to debug.
<br><br><br><br><br><br><br>
# STM32 Mandelbrot Explorer Business Card
A battery powered business card that can explore the Mandelbrot set. Meant to be cheaply made for handing out. Will likely include a flappy bird clone to encourage recruiter competition.
# Project in development! See below for a brief write up.
![STM32 Card Prototype 1](https://git.bpcspace.com/indigo/stm32_business_card/raw/branch/main/preview.png)
## Framebuffer
There's a lot of overhead in sending each pixel individually. The LCD controller requires coordinates for each drawing command, not to mention the overhead of the SPI headers. The obvious solution is to keep a framebuffer and send all the data in one command. Due to limited memory, the framebuffer only contains half the screen; each half is rendered separately. Additionally, when panning the view, the framebuffer is simply shifted to avoid re-rendering parts already on the screen.
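The panning shift might look something like the sketch below. The 160x40 half-screen layout, the `framebuffer` array and the `fb_shift_x` name are all assumptions made for this example; only the idea of moving existing pixels instead of recomputing them comes from the text.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FB_W 160
#define FB_H 40   /* hypothetical: one half of a 160x80 screen */

uint16_t framebuffer[FB_H][FB_W];

/* Shift every row by dx pixels (dx > 0 moves the image right, dx < 0
 * moves it left). The columns exposed by the shift keep stale data and
 * are the only ones the caller still has to render. */
void fb_shift_x(int dx)
{
    if (dx == 0 || dx >= FB_W || dx <= -FB_W)
        return;

    size_t count = (size_t)(FB_W - (dx > 0 ? dx : -dx));

    for (int y = 0; y < FB_H; y++) {
        uint16_t *row = framebuffer[y];
        if (dx > 0)
            memmove(row + dx, row, count * sizeof row[0]);
        else
            memmove(row, row - dx, count * sizeof row[0]);
    }
}
```

`memmove` is used rather than `memcpy` because the source and destination ranges overlap within each row.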
## Project structure
Code: [program/stm32f1_buisnesscard_v1/Core](https://git.bpcspace.com/indigo/stm32_business_card/src/branch/main/program/stm32f1_buisnesscard_v1/Core) <br>
PCB: [kicad/](https://git.bpcspace.com/indigo/stm32_business_card/src/branch/main/kicad)
## Border Tracing
The Mandelbrot set is full; if a closed edge can be found, everything inside has the same number of iterations. This property can be exploited to dramatically decrease render times by searching for closed loops throughout the set. Because of the bugs introduced by fixed point arithmetic, I found that this property only holds for points inside the set: points that require 255 iterations. Despite just appearing black, these were the points taking up the vast majority of the compute time, especially when zoomed out.
I decided to take a shot at border tracing myself. At this point, I had assembled my first working PCB, but developing and debugging the algorithm would be much easier on a computer. I quickly rewrote the code to work on my desktop using Raylib so I could debug faster; the STM32F1 contains only a few hardware breakpoints, and lacks the ability to set watchpoints.
## Software Development
The software is currently working, but is more of a proof of concept (messy and unoptimized). In its current state, I'm leaning heavily on ST's HAL to save time. For a project where everything is written from the ground up, see my other projects below [^1]!
The very basic principle of border detection is simple: when you run across a black pixel, go to the next pixel that is black but touching a colored pixel. While doing this, never backtrack. When you come back to where you started, the curve is closed, and you can then shade the inside. There are a few inherent problems with this; the largest is that we're taking discrete samples of a fractal whose edges are infinitely complex. Very fine detail can make a curve look closed, when in reality there's a sliver that's just not visible due to sample resolution. Take the image below as an example: there are blue spots enclosed within the edge.
![](writeup/closed_border.png)
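The "black but touching a colored pixel" test at the heart of this can be sketched as below. This is an illustration, not the card's algorithm: the tiny 8x8 image, the convention that 0 means black, the 8-neighbor check and the function names are all assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define IMG_W 8
#define IMG_H 8

/* True if (x, y) lies inside the image. */
bool in_bounds(int x, int y)
{
    return x >= 0 && x < IMG_W && y >= 0 && y < IMG_H;
}

/* A border pixel is black (0 here) and has at least one colored
 * (non-zero) pixel among its 8 neighbors. Neighbors outside the image
 * are ignored. */
bool is_border(uint8_t img[IMG_H][IMG_W], int x, int y)
{
    if (!in_bounds(x, y) || img[y][x] != 0)
        return false;

    for (int dy = -1; dy <= 1; dy++) {
        for (int dx = -1; dx <= 1; dx++) {
            if (dx == 0 && dy == 0)
                continue;
            if (in_bounds(x + dx, y + dy) && img[y + dy][x + dx] != 0)
                return true;
        }
    }
    return false;
}
```

A tracer would step from one pixel where `is_border` is true to an adjacent one it has not visited yet, until it arrives back at its starting pixel.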
### Display
I've modified [this repository](https://github.com/afiskon/stm32-st7735) to fit my needs. I've had to make some minor changes to utilize the ST7735's sleep mode, and to make the code work for my specific display. I expect I'll need to rewrite the library to manipulate raw registers to optimize SPI communication, as it currently uses HAL and is *very* slow. I'm considering implementing a game as well, which may require I access SPI via DMA.
### (No) FPU
The STM32F1 lineup doesn't have an FPU, thus fixed point integer arithmetic is used to speed up rendering. The decimal is intentionally too close to the MSB, as this introduces aesthetic visual artifacts at no cost that I think are a nice twist. <br>
*notice the banding around the set; that's an artifact of pushing fixed point arithmetic beyond its limits* <br>
<img src="./writeup/quick_buildings.png" alt="drawing" width="800"/> <br>
<img src="./writeup/quick_spiral.png" alt="drawing" width="800"/> <br>
One of the issues this can cause is that there are points where the algorithm sees two edges. Sometimes, the algorithm can corner itself as seen below. The obvious solution is to keep a list of points where there are two possible paths; when the algorithm traps itself, it can pop off the most recent location and keep running.
### MCU Power Consumption
The software is interrupt based, saving power any time there's not an active job. After 30 seconds, a MOSFET will turn off the backlight and the MCU will enter a deeper sleep state, where only a few microamps are consumed. This allows the card to operate without a power switch, as theoretically the sleeping power consumption is insignificant compared to the shelf life of these batteries (a few months).
I got border tracing working by tracking pixel locations with multiple detectable edges. I spent a lot of time (days of actual work) avoiding this solution, as there was no way I could prove that the list of pixels with multiple edges wouldn't overflow, in which case the algorithm could fail. Overflowing the array was literally never an issue, and the array only holds 32 potential locations.
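A fixed-size list like that can be as small as the sketch below. The 32-entry capacity comes from the text; the type names, the function names and the choice to report a full list through a `false` return value are assumptions for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_BRANCHES 32   /* capacity mentioned in the text */

typedef struct {
    uint8_t x, y;         /* fits a 160x80 display */
} point_t;

typedef struct {
    point_t items[MAX_BRANCHES];
    uint8_t count;
} branch_stack_t;

/* Remember a pixel where the tracer saw more than one possible path.
 * Returns false instead of overflowing if the list is already full. */
bool branch_push(branch_stack_t *s, point_t p)
{
    if (s->count >= MAX_BRANCHES)
        return false;
    s->items[s->count++] = p;
    return true;
}

/* Fetch the most recently stored branch point, for when the tracer has
 * cornered itself. Returns false if there is nothing left to try. */
bool branch_pop(branch_stack_t *s, point_t *out)
{
    if (s->count == 0)
        return false;
    *out = s->items[--s->count];
    return true;
}
```

At two bytes per point the whole list costs 65 bytes of RAM, and a failed `branch_push` gives the caller a chance to fall back to rendering the region pixel by pixel.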
## Hardware Development
The PCB works, and is what I'm using to test code. I'm planning on making some changes, including adding MOSFETs for the display backlight and creating some art for the silk screen.
The hardware is intentionally left simple for rapid development; there are two ideal diodes, allowing power delivery via USB-C or batteries, a step-down switching regulator, buttons with debouncing circuits, and of course, the MCU.
Some of the most vital components are described below.
To prevent pixels from being re-rendered, I stored information in the green channel of the image. My images never used green anyway, and I wouldn't have to eat any more memory (I only had ~500 bytes left). Using the green channel also gave me a conveniently visual way to debug.
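One way to pack per-pixel state into the unused green bits is sketched below. It assumes RGB565 pixels, where green occupies bits 5 to 10; the two flag positions and all the names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define GREEN_MASK     0x07E0u     /* the 6 green bits of an RGB565 pixel */
#define FLAG_RENDERED  (1u << 5)   /* hypothetical bit assignment */
#define FLAG_IN_SET    (1u << 6)   /* hypothetical bit assignment */

/* Store a green-free color together with its bookkeeping flags. */
uint16_t pixel_store(uint16_t color, bool in_set)
{
    uint16_t px = (uint16_t)(color & ~GREEN_MASK);
    px |= FLAG_RENDERED;
    if (in_set)
        px |= FLAG_IN_SET;
    return px;
}

bool pixel_is_rendered(uint16_t px) { return (px & FLAG_RENDERED) != 0; }
bool pixel_is_in_set(uint16_t px)   { return (px & FLAG_IN_SET) != 0; }

/* Strip the flags again to recover the plain red/blue color. */
uint16_t pixel_color(uint16_t px)
{
    return (uint16_t)(px & ~GREEN_MASK);
}
```

Sending the stored value to the display unmasked tints flagged pixels green, which is what makes the flags visible while debugging; masking with `pixel_color` first hides them.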
### Batteries
This card will be powered by 4 zinc air batteries (cost effective & high power density, commonly used in hearing aids). The downside to these batteries is that they're non-rechargeable, and after activation, they only have a shelf life of a few weeks. Batteries are expensive, and what I've got is more than enough; it may actually be overkill.
A battery holder was developed to be used with springs to hold the batteries in place. I haven't ordered the batteries yet, so I have yet to see how it performs.
## Border Tracing Performance
### Display
The display is the cheapest $2 65k-color 160x80 LCD I could find on LCSC. It is the most expensive part of my card, tied with the batteries. I'm sure I could have found it on AliExpress for 50¢, but hindsight is 20/20. It's actually very pretty, and I plan to upload images soon.
## Hardware Design
Hardware was relatively straightforward. I miscalculated the power consumption by a factor of 10, so my first PCB used non-rechargeable zinc-air batteries, as this was the cheapest solution I could find. My first card worked without issues, as the only real circuitry was debouncing and a step-down switching regulator.
### SPI troubles
Currently, SPI only works consistently at 4 MHz. I have yet to probe it with my oscilloscope and figure out what's wrong; I'm guessing parasitic capacitance. My next PCB will have differential routing to allow higher speeds.
After discovering my error, I decided to make a more complex design that allowed recharging AAA batteries via USB-C. To maximize battery longevity, a switching regulator was used for the step-up (battery) supply, and a simple linear regulator for USB-C. To keep the batteries from over-discharging, I added a voltage supervisor that shuts down the switching regulator at 1V. The switching regulator has true shutdown, blocking the input when shut off. The voltage monitor uses a capacitor to time how long it waits before turning back on above 1V. While finishing up the PCB I was tired and didn't care to calculate this value, so I just slapped a 1uF cap down; hence the card takes around 5 seconds to power up after a battery is inserted.
## Things I'd change
I'm used to working with much lower spec MCUs, and after the purchase I've found that the STM32F1 is actually quite expensive for its performance. I might want to try something uber cheap for my next project requiring a higher speed 32 bit processor, as long as it's got a HAL to accelerate development. The ch32v003 looks pretty cool...
My second iteration worked almost flawlessly. If it was unplugged while charging, however, I'd start running into issues. The USB bus sat at 3V for some reason, powering components that should have been off. I found that the ideal diode IC I was using as a power ORing circuit was latching on. Despite the name, ideal diodes don't actually block current flowing from output to input. The power would travel through the linear regulator, powering the NMOS that enabled the ideal diode IC. This was fixed by simply adding a diode, as circled in magenta.
In the photo above, bat_3v was the power rail of the battery voltage regulator. Obviously, when the NMOS is active, the signal won't drop to 0V as intended; the diode should be replaced with a resistor, and the 10k resistor should be replaced with a short. So that I didn't have to order new PCBs, I simply substituted a 0Ω for the 10kΩ, and made a messy solder bridge to fit a 10kΩ across the diode pads.
When the switching regulator is used, my voltage briefly rises above 3.1V each duty cycle, a few mV above the maximum voltage for the LCD's backlight. I don't understand why I need a specific voltage for a backlight that is, in the end, just an LED; I'd think the voltage drop should just be constant. Despite this, the screen functions without issues. Maybe the lifetime is shortened, but I'm not making money and it works.
## Assembly
I used a hot air station to assemble the cards. The final PCBs were purchased a bit too close to the career fair, so a stencil would have been very expensive to ship. To work around this, I just mixed my solder paste with flux. This was a mistake. While I got good results, each card takes a very long time to make. Maybe I was being too precise, but check out how many pads there are below. I was planning to hand them out at the fair, but I'm now only planning on handing them out during interviews. Additionally, getting the pins of the MCU to align properly and removing the solder bridges with wick was a bitch and a half.
## Other Projects
[^1]: For another embedded project without any assistance from HAL, check out my (unfinished) [AVR wristwatch](https://git.bpcspace.com/indigo/AVRwristwatch), where everything, from the I2C display to the RTC, is developed from the ground up! <br>
For a much larger project that's not exactly embedded, check out my [Operating System](https://git.bpcspace.com/indigo/IndigoOS). I haven't been able to work on it since going to Missouri S&T, but some impressive feats include a bootloader, an efficient binary tree/buddy system physical memory allocator, and multi-core execution. It's a lot of code, maybe check it out!