
All the Source Code in one Place

An attempt to gather all source code found on this website in one place, neatly zipped. Work in progress!

Source code license for nonderivative works, unless clearly labelled otherwise: CC0 1.0.

Source code from assorted articles

imagetrans_2019.zip - Article: A Different Method for Image Transformations
C, Intel SSE2 intrinsics, ARM NEON assembler.
I'm pretty certain that the method used is unique: fit all the data in the L1 cache and rotate it in place to make the horizontal pass simpler. Feel free to rip it off. The ARM NEON version is written as inline assembler, which wasn't a stellar idea. I spent some time deciding how to stack the instructions, but it didn't seem to matter on the TX1.
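
A minimal sketch of one way to read the "rotate in place" idea, not the code from the zip: keep a small square tile in cache, transpose it in place, and reuse the vertical pass as the horizontal pass. The tile size `TILE_DIM`, the averaging filter and all function names are assumptions made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tile size: 64 x 64 bytes = 4 KiB, small enough to sit
 * in a typical L1 data cache. */
#define TILE_DIM 64

/* Transpose a square tile in place by swapping across the diagonal. */
static void tile_transpose_inplace(uint8_t tile[TILE_DIM][TILE_DIM])
{
    for (size_t y = 0; y < TILE_DIM; y++) {
        for (size_t x = y + 1; x < TILE_DIM; x++) {
            uint8_t tmp = tile[y][x];
            tile[y][x] = tile[x][y];
            tile[x][y] = tmp;
        }
    }
}

/* Example vertical pass: average each pixel with the pixel below it,
 * rounding up. The last row is left unchanged. Each row is contiguous
 * in memory, so the inner loop is easy to vectorize. */
static void vertical_pass(uint8_t tile[TILE_DIM][TILE_DIM])
{
    for (size_t y = 0; y + 1 < TILE_DIM; y++)
        for (size_t x = 0; x < TILE_DIM; x++)
            tile[y][x] = (uint8_t)((tile[y][x] + tile[y + 1][x] + 1) >> 1);
}

/* Horizontal pass expressed as: transpose, vertical pass, transpose back. */
static void horizontal_pass(uint8_t tile[TILE_DIM][TILE_DIM])
{
    tile_transpose_inplace(tile);
    vertical_pass(tile);
    tile_transpose_inplace(tile);
}
```

The point of the sketch is that only one filter kernel has to be written and tuned, because the horizontal case is turned into the vertical one while the tile is still in cache.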

xmasdemo_2017.zip - Article: Nvidia Jetson TX2 Xmas Demo 2017
C, ARM NEON intrinsics, GLSL.
Several dubious GLSL optimization techniques are used to make this demo run at 60 fps on the TX2 256-core GPU. If you're looking for precise math, look elsewhere. Close enough will have to do on that GPU! Everything should fly by at 60 fps. Most of it is based on GPU code found on Shadertoy.

tilegx_int_raytracer_2017.zip - Article: Integer Raytracing on Mellanox TILE-Gx
C, TILE-Gx intrinsics.
I looked at some assorted integer libraries available on the interwebs. They sucked. The TILE-Gx multicore CPU has some floating-point support, so I combined that with some classic optimizations from Doom. I also wrote a new parallelization library that uses the internal network for blisteringly fast communication between the cores.
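
For readers who haven't seen integer math of the Doom kind: Doom is known for its 16.16 fixed-point arithmetic. The sketch below shows that style applied to a dot product, which is the bread and butter of a raytracer. Whether the zip uses this exact layout is an assumption, and the names `fixed_t`, `fixed_mul`, `fixed_div` and `fixed_dot3` are only illustrative.

```c
#include <stdint.h>

/* 16.16 fixed point: upper 16 bits integer part, lower 16 bits fraction. */
typedef int32_t fixed_t;
#define FRACBITS 16
#define FRACUNIT (1 << FRACBITS)   /* the value 1.0 */

/* Multiply in 64 bits, then drop the extra 16 fraction bits.
 * Right-shifting a negative value is implementation-defined in C,
 * but is an arithmetic shift on mainstream compilers. */
static fixed_t fixed_mul(fixed_t a, fixed_t b)
{
    return (fixed_t)(((int64_t)a * (int64_t)b) >> FRACBITS);
}

/* Pre-scale the dividend so the quotient keeps 16 fraction bits.
 * The caller must make sure b is not zero and the result fits. */
static fixed_t fixed_div(fixed_t a, fixed_t b)
{
    return (fixed_t)(((int64_t)a * FRACUNIT) / b);
}

/* Dot product of two 3-vectors, accumulated in 64 bits and
 * scaled back once at the end. */
static fixed_t fixed_dot3(const fixed_t a[3], const fixed_t b[3])
{
    int64_t sum = 0;
    for (int i = 0; i < 3; i++)
        sum += (int64_t)a[i] * (int64_t)b[i];
    return (fixed_t)(sum >> FRACBITS);
}
```

For example, `fixed_mul(3 * FRACUNIT / 2, 2 * FRACUNIT)` multiplies 1.5 by 2.0 and yields `3 * FRACUNIT`.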

gpuray_2017.zip - Article: GPURay - GPU+CPU Raytracer for NVidia Tegra X1
I once attended a course about writing code for GPUs. The guy had no clue about how GPUs work. This crap runs fast as hell, by exploiting the fact that the CPU can predict how things are gonna execute.

srpt_aes_revisited_2016.zip - Article: SRTP AES Optimization Revisited
This is getting old by now, but it simplifies the AES CTR mode constant buffer calculations by eliminating the common terms. It runs much faster.

gpu_hacks_2016.zip - Article: GPU Hacks: Fractals, a Raytracer and Raytraced Quaternion Julia Sets
Early GPU work.

tilegx_aes_openssl_2015.zip - Article: OpenSSL aes_core.c Replacement for EZchip TILE-Gx
C, TILE-Gx intrinsics.
I love the TILE-Gx CPU! This thing can replace the OpenSSL AES encrypt function by using the mega-awesome tblidx instructions. Also, I redid the final pass with a 64-bit merge. This could probably be done on x64-based CPUs too. Haven't seen it yet. Feel free to rip it off.

multiquake_2014.zip - Article: MultiQuake - Quake for Mellanox TILE-Gx CPUs
C, TILE-Gx intrinsics.
Quake for the TILE-Gx multicore CPU. The 36-core version has no problems running 1 Quake per core.

multidoom_2014.zip - MultiDoom - Doom for Mellanox TILE-Gx CPUs
C, TILE-Gx intrinsics.
My first foray into programming the TILE-Gx multicore CPU. The 36-core version has no problems running 1 Doom per core. I also had a version running on the TILE64 PCIe development board before that.

homepilotsrc_1999.zip - Article: Source Code and Schematics for PCTVNet HomePilot Set Top Box
C, x86 assembler.

It's a functional compiler that generates 68000 code, written in C++ version 2 in 1996.

C++ and 68020 assembler.
Features a raytracer written by a friend of mine.

68000 assembler. 68020, to be precise. I recovered this source in 2020 and fixed an old timing bug, so it should run smoothly now. The fix was a bit crap, but I haven't written 68k in 20 years so err...

68000 assembler.
I didn't write any of this! The code was done by Warp and Smeagol.

Source code from the Pipeline section

From 27 June 2016.

Early threaded raytracer that splits the picture into lumps of lines. Wasn't a very bright idea.
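
A minimal sketch of what splitting a picture into lumps of lines can look like, assuming one contiguous lump per worker thread. It only shows the partitioning step, and the names `line_lump` and `lump_for_worker` are made up for illustration rather than taken from the original source.

```c
#include <assert.h>

/* A contiguous band of scanlines handed to one worker thread. */
typedef struct {
    int first_line;   /* index of the first scanline in the lump */
    int line_count;   /* number of scanlines in the lump */
} line_lump;

/* Split `height` scanlines into `workers` lumps of near-equal size.
 * The first (height % workers) lumps get one extra line each. */
static line_lump lump_for_worker(int height, int workers, int worker)
{
    assert(height >= 0 && workers > 0);
    assert(worker >= 0 && worker < workers);

    int base  = height / workers;
    int extra = height % workers;

    line_lump lump;
    lump.first_line = worker * base + (worker < extra ? worker : extra);
    lump.line_count = base + (worker < extra ? 1 : 0);
    return lump;
}
```

With 10 lines and 4 workers this gives lumps of 3, 3, 2 and 2 lines, starting at lines 0, 3, 6 and 8. One likely weakness of a static split like this is that lines rarely cost the same to trace, so some threads can finish early and sit idle while others are still busy.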