Source code protection

Hi, I am playing around with a Pico and prototyping an idea I have. It's all very new to me; I come from a software engineering background creating compiled desktop applications, where it's generally pretty easy to hide your IP from the average Joe Blow. If my idea works and I produce a device based on a Pico (low volume, fewer than 10 units a year), what is to stop someone simply pulling it apart, plugging the Pico into Thonny and stealing the code?

thanks
Darren


Hi Darren,

Welcome to the forum!!

MicroPython might not be your best option then; I'd take a look at Arduino, as it's a compiled language.
Other microcontrollers allow a fuse to be set so that program code can't be read off, but this isn't the case for the RP2040 (it was designed for education rather than industrial-level reliability; they mention it in the datasheets somewhere).


Thank you Liam… yes, I will have to rethink it for production, I guess. I'll prototype in MicroPython (as I'm using the PiicoDev transceivers) and once I'm convinced it's worthwhile I could then either look at other RF modules with C headers or port it across to Arduino. I'm having fun playing with the hardware, and it looks like there are lots of ways to skin the cat :)


Hi Darren,

Good luck with your project. Don't be alarmed if you have to go through several iterations; as you put it, there are many ways to skin a cat.

Look forward to hearing how you go!


Hi @Liam120347.
If someone used a HAL and a compiled language with an RP2040, would someone still be able to reach in and pull the machine code?


Hi Pixmusix,

In the case of the Pico (and any RP2040 implementation), code is stored on an external flash IC. We actually stock these so you can take a look!

There’s a good manual on building a board around the RP2040 that explains it well:

The downside of this is that it's stupid easy to read code off: you just pretend to be a host microcontroller and the flash IC will dutifully read out its brains.

My understanding is that solutions like ARM TrustZone can help protect your code (strictly, it isolates secure code from the rest of the system rather than encrypting it), but I've never used it and it's usually reserved for fancier microcontrollers.
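
To make the "pretend to be a host" point concrete, here is a rough sketch in JavaScript (runnable under Node.js). It is only a toy model, not real hardware access: `SpiFlashModel` and `dumpFlash` are made-up names, and the only thing taken from real parts is that common SPI NOR flash chips answer the Read Data command (0x03, followed by a 24-bit address) for whoever is driving the bus.

```javascript
// Toy model of an SPI NOR flash chip. Note there is no authentication step:
// the chip answers whichever host is driving the bus.
class SpiFlashModel {
  constructor(contents) {
    this.memory = Uint8Array.from(contents);
  }

  // One chip-select transaction: bytes shifted in by the host,
  // plus how many bytes the host then clocks out.
  transfer(mosiBytes, readLength) {
    const [command, a2, a1, a0] = mosiBytes;
    if (command !== 0x03) {
      throw new Error("only the Read Data command (0x03) is modelled");
    }
    const address = (a2 << 16) | (a1 << 8) | a0;
    return this.memory.slice(address, address + readLength);
  }
}

// Hypothetical "attacker" host: walk the address space and copy everything out.
function dumpFlash(flash, sizeBytes, chunk = 4) {
  const image = [];
  for (let addr = 0; addr < sizeBytes; addr += chunk) {
    const cmd = [0x03, (addr >> 16) & 0xff, (addr >> 8) & 0xff, addr & 0xff];
    image.push(...flash.transfer(cmd, Math.min(chunk, sizeBytes - addr)));
  }
  return Uint8Array.from(image);
}

const firmware = Buffer.from("secret firmware image");
const chip = new SpiFlashModel(firmware);
const stolen = dumpFlash(chip, firmware.length);
console.log(Buffer.from(stolen).toString()); // prints "secret firmware image"
```

Whether the bytes started life as MicroPython, C or assembly makes no difference to the read-out step; it only changes how much work is needed to make sense of the dump afterwards.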


Oh really!? Brutal! So it could be MicroPython, assembly, or machine code… the IC will surrender it all the same. I didn't know that. Thanks @James


Hi Pix,

Most programs can be figured out quickly enough if the person wants to copy the project.

In this day and age, if a person can load a file from an embedded device, they probably have the expertise to reverse engineer the program: How to rip code off an Arduino? - #3 by oyan244 - Project Guidance - Arduino Forum

A heads up: if you are making a product that needs to be reliable, it's best to understand every part of the system. While PiicoDev is a great line of products, there are many points at which it could fail if it's put in an industrial environment (that's not to say the modules are designed poorly, but if they are used, a very close eye should be passed over them, to the point that you practically redesign the modules).


How do companies and businesses with proprietary software or products protect their intellectual property?


Think BIG lawsuits - such that it becomes uneconomic to try …


Hi Liam

Very good advice.

I remember years ago that for some customers, particularly government and agencies or wherever higher reliability was a requirement, there used to be (and probably still is) a number that had to be quoted called MTBF (Mean Time Between Failures), which is simply what the name implies. Sometimes this was part of a design specification. In this age of pretty powerful computers it is probably not such an onerous task, but in pre-computer days it was a very time-consuming job. The nitty-gritty of every component had to be traced back to manufacture, and a figure for the complete item was arrived at and expected to be met.

One of the last major projects I was involved with (some 25 years ago now) before ceasing full-time work (due to my wife's ill health) had another one thrown in: MTTR, or Mean Time To Repair, which seemed a bit hairy. Fortunately the quality of technical people in this branch of the industry is quite high, so the condition could be met with reasonable certainty. By the way, this MTTR was 30 minutes. I had to rethink the layout of most of the equipment rack cabinets because the cooling fan in one piece of equipment could not be changed in this time with the original arrangement.
Cheers Bob


Hi Darren,
I have never tried it, but you can also write programs for the Pico using C/C++.
Here is a link

You would compile your code, and only the compiled binary is installed on the Pico, so you keep control of the source code. Someone could of course decompile it, but that is not a simple task. There are also methods to obfuscate C code to make it even harder to understand once decompiled.

Let us know if you choose to go down this path.
David


PIC processors store programs in on-chip flash memory. Setting the CP (code protect) bit disables the ability of external programmers to read or write the flash memory. The CP bit can only be unset by a bulk erase, which erases all memory. Maybe there’s a PIC processor that fits your requirements. Or another processor that implements the equivalent of the CP bit.
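
As a way of picturing that behaviour, here is a small JavaScript model (runnable under Node.js) of the rule described above: once the protect bit is set, the external programmer can't read or write, and the only way to clear the bit also wipes the contents. `ProtectedFlashModel` and its methods are invented for this sketch, and refusing with an error is a simplification; real parts have their own ways of denying access.

```javascript
// Simplified model of on-chip flash guarded by a code-protect (CP) bit.
class ProtectedFlashModel {
  constructor(sizeBytes) {
    this.memory = new Uint8Array(sizeBytes).fill(0xff); // erased flash reads 0xFF
    this.codeProtect = false;
  }

  // External programmer writes an image starting at address 0.
  program(bytes) {
    if (this.codeProtect) throw new Error("CP set: external write refused");
    this.memory.set(bytes);
  }

  setCodeProtect() {
    this.codeProtect = true;
  }

  // External programmer tries to read the contents back.
  externalRead() {
    if (this.codeProtect) throw new Error("CP set: external read refused");
    return Uint8Array.from(this.memory);
  }

  // The only way to clear CP: everything is erased along with it.
  bulkErase() {
    this.memory.fill(0xff);
    this.codeProtect = false;
  }
}

const mcu = new ProtectedFlashModel(8);
mcu.program(Uint8Array.from([1, 2, 3, 4]));
mcu.setCodeProtect();

let readRefused = false;
try {
  mcu.externalRead();
} catch (err) {
  readRefused = true; // the protected image cannot be copied out
}

mcu.bulkErase(); // access is restored, but the firmware is gone
const afterErase = mcu.externalRead();
console.log(readRefused, afterErase.every((byte) => byte === 0xff)); // prints "true true"
```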


I too have a history of writing Windows apps in Visual C++ (for instrumentation purposes), and in more recent years writing for microcontrollers (especially PIC32 programmed in C), sometimes combined with a Raspberry Pi programmed in C/C++ with the GTK3 GUI framework (much faster than Python if you need the speed).

As others have said, an important first step in protecting your source code is to write the firmware in a compiled language such as C, C++, or the Arduino flavour of C++ (which, like other C/C++ toolchains, also supports multiple source-code modules as .cpp and .h files if needed for larger, more complex programs).

Note that some 32-bit microcontrollers, such as certain PIC32 models, include encryption hardware. They also have up to 2MB of flash memory and 512kB or more of RAM, and they have read-protect bits like the other processors mentioned above.

One idea I've read of for more extreme IP protection with these microcontrollers is to keep an encrypted version of the compiled code in flash and have the bootloader decrypt it into RAM as the micro starts up. I have not tried this. It sounds tricky to implement, and would only be worth the bother if a higher level of IP protection was needed, beyond just compiling the code.
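
Purely to illustrate the shape of that idea (not how any particular bootloader does it), here is a desktop sketch in JavaScript using Node.js's built-in `crypto` module. The function names `encryptFirmware` and `bootloaderDecrypt` are made up, AES-256-GCM is just my choice for the example, and the key handling is an assumption: on a real device the scheme only helps if the key sits somewhere that cannot itself be read out.

```javascript
const crypto = require("crypto");

// Build step (on the developer's PC): encrypt the compiled image before it
// is written to the flash.
function encryptFirmware(plainImage, key) {
  const iv = crypto.randomBytes(12); // fresh nonce for every build
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plainImage), cipher.final()]);
  return { iv, ciphertext, tag: cipher.getAuthTag() };
}

// Boot step: what a bootloader would do before jumping into the code.
// The returned buffer stands in for RAM; final() throws if the stored
// image or tag has been tampered with.
function bootloaderDecrypt(stored, key) {
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, stored.iv);
  decipher.setAuthTag(stored.tag);
  return Buffer.concat([decipher.update(stored.ciphertext), decipher.final()]);
}

// Assumption: this key would live in protected on-chip storage.
const deviceKey = crypto.randomBytes(32);
const compiledImage = Buffer.from("compiled firmware bytes");

const inFlash = encryptFirmware(compiledImage, deviceKey);
const inRam = bootloaderDecrypt(inFlash, deviceKey);

console.log(inRam.equals(compiledImage)); // true: the decrypted copy matches
```

Someone dumping the flash would only get `inFlash.ciphertext`, but the decrypted copy still exists in RAM while the device runs, so this raises the bar rather than removing the problem.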

Regarding Microchip's microcontrollers such as their PIC32 (and ARM) offerings, I have found their 'Harmony 3' framework very helpful for non-graphics applications. To learn how to use Harmony 3, there is free training material available at the 'Microchip University' site:

I found some of the Harmony 3 and other training material there very helpful when I first used PIC32MZ-EF microcontrollers (which by the way include a floating-point processor) for some designs over the last couple of years.

Best wishes with your projects.


At the end of the day, not much. There's always a way to read the flash off a microcontroller of any description; it's just a matter of difficulty. Even with one-time-programmable fuses, protection bits, or compiled code, there's almost always a way to read what's in memory or to approximate the source code by reverse engineering the compiled instructions.

There are even several studies on using scanning electron microscopes, X-rays, or similar techniques (although obviously this is mostly inaccessible to the average Joe and hardly worth the effort, just an interesting point about physical security):

https://www.researchgate.net/publication/312551555_Reverse_engineering_Flash_EEPROM_memories_using_Scanning_Electron_Microscopy

One trick you can use, however, to make things much more difficult once they've got the source is obfuscation. PyArmor is probably the most well-known module for it, but many others exist too:

Essentially, obfuscation leaves the behaviour (and often the optimization too) untouched, but replaces every identifier with randomised names, shuffles reorderable code blocks, scrambles the control flow to make it much harder to interpret, and uses many little tricks so that even if you've got the source code, it's practically impossible to do anything with it other than run it. Here's a little snippet you can run in your browser as a demo (hit F12, select the console, paste the code in, then type main() to run it).

Original

const main = () => {
    const answer = prompt("What is five times nine?")
    if (answer === "at least forty") {
        console.log("That is exactly correct")
    }
}

Simple Obfuscation Techniques

var _0xaBc12 = "console", _0x4Bf6E = "log", _0xOe91X = "What is five times nine?", _0x5aD55 = "at least forty";var main = function() {var _0xElm4R = prompt(_0xOe91X);var _0xNoOp1 = function() {}; _0xNoOp1();var _0xNoOp2 = function() {}; _0xNoOp2();if(_0xElm4R === _0x5aD55) {window[_0xaBc12][_0x4Bf6E]("That is exactly correct");}
try { null[0](); } catch(e) {}};

Now obviously my example above is quite simple, but it should give an idea of what's going on and how it could be done.
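
For anyone curious how the identifier-renaming part might be automated, here is a deliberately naive sketch (runnable under Node.js). `renameIdentifiers` is a made-up helper, not part of any real obfuscator: it does whole-word text replacement, so it would also rewrite matching words inside strings and comments, which real tools avoid by parsing the code first.

```javascript
// Naive renamer: swap each listed identifier for a meaningless generated name.
function renameIdentifiers(source, identifiers) {
  let output = source;
  identifiers.forEach((name, index) => {
    const replacement = "_0x" + index.toString(16).padStart(4, "0");
    // \b keeps the match to whole words, so "width" won't hit "bandwidth".
    output = output.replace(new RegExp("\\b" + name + "\\b", "g"), replacement);
  });
  return output;
}

const original = "function area(width, height) { return width * height; }";
const obfuscated = renameIdentifiers(original, ["area", "width", "height"]);
console.log(obfuscated);
// prints: function _0x0000(_0x0001, _0x0002) { return _0x0001 * _0x0002; }

// The renamed code still behaves the same as the original.
const renamedResult = new Function(obfuscated + " return _0x0000(3, 4);")();
console.log(renamedResult); // prints 12
```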