Support for HID-based partial super-fast flashing (#523)

* fix bug
* Fixed an issue where the Game of Life menu item was not appearing (#497)
* Starting on dapjs flashing
* Adding dapjs
* Connected
* Flashing works
* Double buffer flashing
* Add SHA computation function
* Run SHA code
* Swap SHA for murmur+crc
* Switch to dual murmur3
* Partial flashing works
* Remove unused code
* Move flashing code to external/sha
* Fix whitespace
* Cleanup binary generation scripts
* Add docs for hid flashing
* bump pxt-core to 0.12.132,
2017-09-18 09:45:27 -07:00
parent bd291854fd
commit 5a6f96af69
14 changed files with 4216 additions and 24 deletions
@@ -17,6 +17,7 @@ clients/**/bin/**
 clients/**/obj/**
 electron-out
 hexcache
 build
 *.user
 *.sw?
@@ -306,3 +306,4 @@
     * [Servo](/device/servo)
     * [Simulator](/device/simulator)
     * [Usb](/device/usb)
     * [Flashing via HID (CMSIS-DAP)](/hidflash)
@@ -29,4 +29,5 @@
 * [Command Line Interface](/cli)
 * Learn about [packages](/packages)
 * [Flashing via HID (CMSIS-DAP)](/hidflash)
@@ -0,0 +1,63 @@
 # Flashing via HID (CMSIS-DAP)
 When the web app has access to a HID connection to the board, it can flash
 the board via the hardware debugger interface.
 The PXT localhost server can proxy HID connections (over a WebSocket),
 and native apps can access HID via various custom APIs (which are
 likely to have lower latency than the HID proxy).
 Flashing is generally done by
 writing a small flashing program to RAM, then writing the page to be
 flashed to RAM, and then running the program. For the next page, one keeps
 the flashing program but replaces the data. Internally, the DAPLink
 software on the @boardname@ does the same.
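The loop described above can be sketched as follows. This is only an illustration: the `Target` interface and the `FakeTarget` class are hypothetical stand-ins (synchronous for brevity, whereas real debugger access is asynchronous), and the two RAM addresses are assumptions borrowed from the constants in the editor extension code.

```typescript
// Hypothetical, synchronous stand-in for a debugger connection (not the DapJS API).
interface Target {
    writeRam(addr: number, data: Uint8Array): void;
    // Runs the program at `entry` with two arguments and returns once it has finished.
    run(entry: number, flashAddr: number, ramAddr: number): void;
}

interface Page {
    flashAddr: number;
    data: Uint8Array;
}

const PROGRAM_ADDR = 0x20000000; // assumed RAM address of the flashing program
const DATA_ADDR = 0x20002000;    // assumed RAM address of the page buffer

function flashPages(target: Target, program: Uint8Array, pages: Page[]): void {
    // The flashing program is uploaded once and kept for all pages.
    target.writeRam(PROGRAM_ADDR, program);
    for (const page of pages) {
        // Only the data buffer is replaced for each page.
        target.writeRam(DATA_ADDR, page.data);
        target.run(PROGRAM_ADDR, page.flashAddr, DATA_ADDR);
    }
}

// In-memory fake board: "running" the program copies the RAM buffer into flash.
class FakeTarget implements Target {
    ram = new Map<number, Uint8Array>();
    flash = new Map<number, Uint8Array>();
    programUploads = 0;

    writeRam(addr: number, data: Uint8Array): void {
        if (addr === PROGRAM_ADDR) this.programUploads++;
        this.ram.set(addr, data.slice());
    }

    run(entry: number, flashAddr: number, ramAddr: number): void {
        const buf = this.ram.get(ramAddr);
        if (!this.ram.has(entry) || !buf) throw new Error("program or data missing");
        this.flash.set(flashAddr, buf.slice());
    }
}

const fakeBoard = new FakeTarget();
flashPages(fakeBoard, new Uint8Array([0x00, 0xbe]), [
    { flashAddr: 0x0000, data: new Uint8Array([1, 2, 3, 4]) },
    { flashAddr: 0x0400, data: new Uint8Array([5, 6, 7, 8]) },
]);
console.log(`pages flashed: ${fakeBoard.flash.size}, program uploads: ${fakeBoard.programUploads}`);
```

The point of the structure is that the program upload is paid once, so the per-page cost is dominated by transferring the page data.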
 Flashing via DAP over HID is considerably slower than the regular
 drag-and-drop method. This is because of the overhead of the DAP protocol
 and the limited throughput of HID (one packet of at most 64 bytes per millisecond).
 Additionally, the DAP protocol requires every HID packet to be acknowledged,
 effectively halving the bandwidth.
 Thus, typical flashing speeds (using the HID proxy) are around 14 kB/s, with a typical
 full flash taking 15 s. The theoretical maximum is around 25 kB/s.
 A custom flashing protocol, like [HF2](https://github.com/Microsoft/uf2/blob/master/hf2.md),
 can achieve around 60 kB/s; however, this would require updating the DAPLink software,
 and is still not very fast.
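The bandwidth limits quoted above follow from simple arithmetic, sketched below. The names are illustrative, 1 kB is taken to be 1000 bytes, and the 256 KiB image size is an assumption (256 pages of 1024 bytes, matching the constants in the flashing code); a real image may cover fewer pages.

```typescript
const PACKET_BYTES = 64;         // maximum HID packet size
const PACKETS_PER_SECOND = 1000; // one packet per millisecond

// Raw HID throughput, and the same with every packet acknowledged.
const rawBytesPerSecond = PACKET_BYTES * PACKETS_PER_SECOND; // 64000
const ackedBytesPerSecond = rawBytesPerSecond / 2;           // 32000

function secondsToFlash(bytes: number, bytesPerSecond: number): number {
    return bytes / bytesPerSecond;
}

const FULL_IMAGE_BYTES = 256 * 1024; // assumed upper bound on the image size

console.log(`raw HID:      ${rawBytesPerSecond} B/s`);
console.log(`acknowledged: ${ackedBytesPerSecond} B/s`);
console.log(`full image at 14 kB/s: ${secondsToFlash(FULL_IMAGE_BYTES, 14000).toFixed(1)} s`);
console.log(`full image at 25 kB/s: ${secondsToFlash(FULL_IMAGE_BYTES, 25000).toFixed(1)} s`);
```

The acknowledged rate is an upper bound on any DAP-over-HID transfer; the text attributes the remaining gap down to the quoted figures to DAP protocol overhead.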
 ## Partial flashing
 Instead, we take care to only flash the pages that have changed.
 In typical software development, only a very small fragment of the program
 changes on every re-deployment. Additionally, most of the program is pretty
 much constant (two bootloaders, the softdevice, and the compiled C++ runtime).
 This is achieved by first deploying a small program which computes
 the checksum of every page. Then, these checksums are read from the device
 and compared with the checksums of the pages of the `.hex` file to be deployed.
 Only pages whose checksums do not match are flashed.
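A minimal sketch of the comparison step, assuming the device reports two 32-bit checksum words per page in one flat array. The `Page` shape, the `changedPages` helper, and `toyHash` are hypothetical names for illustration; `toyHash` is a deliberately trivial stand-in, not the hash used by the flashing code.

```typescript
interface Page {
    index: number;     // page number within flash
    data: Uint8Array;  // new contents of the page
}

// Returns two unsigned 32-bit checksum words for a page.
type PageHash = (data: Uint8Array) => [number, number];

function changedPages(pages: Page[], deviceChecksums: Uint32Array, hashPage: PageHash): Page[] {
    return pages.filter(page => {
        const i = page.index * 2;
        // No checksum known for this page: flash it to be safe.
        if (i + 1 >= deviceChecksums.length) return true;
        const [h0, h1] = hashPage(page.data);
        return deviceChecksums[i] !== h0 || deviceChecksums[i + 1] !== h1;
    });
}

// Toy stand-in hash (byte sum and its complement), for demonstration only.
const toyHash: PageHash = data => {
    let sum = 0;
    data.forEach(b => { sum = (sum + b) >>> 0; });
    return [sum, (sum ^ 0xffffffff) >>> 0];
};

// Checksums of what is currently on the (imaginary) device: two pages.
const onDevice = new Uint32Array([
    ...toyHash(new Uint8Array([1, 2, 3, 4])),
    ...toyHash(new Uint8Array([5, 6, 7, 9])),
]);

const toDeploy: Page[] = [
    { index: 0, data: new Uint8Array([1, 2, 3, 4]) }, // unchanged
    { index: 1, data: new Uint8Array([5, 6, 7, 8]) }, // modified
    { index: 2, data: new Uint8Array([9]) },          // beyond known checksums
];

const pagesToFlash = changedPages(toDeploy, onDevice, toyHash).map(p => p.index);
console.log(pagesToFlash);
```

Treating pages without a known checksum as changed errs on the side of flashing too much rather than too little.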
 The particular checksum algorithm used is [Murmur3](https://en.wikipedia.org/wiki/MurmurHash#MurmurHash3).
 The algorithm is simplified by omitting the handling of unaligned data and the hashing
 of the data length, since all blocks hashed are of the same, aligned length.
 The Murmur3 hash was chosen since it's very fast (around 4x faster than CRC32 and around
 15x faster than SHA256). Hashing the entire flash takes about 200ms.
 In fact, two 32-bit Murmur3 hashes (using different starting seeds) are computed in
 parallel, to produce a 64-bit checksum.
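The host-side counterpart of this hash can be sketched in a self-contained form as below. It follows the simplified scheme described above (no tail handling, no length mixing); the function names are illustrative, the two seeds are the ones used in the editor extension code, and little-endian word order is an assumption.

```typescript
function rotl32(x: number, r: number): number {
    return (x << r) | (x >>> (32 - r));
}

// Two simplified 32-bit Murmur3 hashes computed in one pass with different seeds.
function dualMurmur3(data: Uint8Array, seed0 = 0x2F9BE6CC, seed1 = 0x1EC3A6C8): [number, number] {
    if (data.length % 4 !== 0) throw new Error("length must be a multiple of 4");
    const view = new DataView(data.buffer, data.byteOffset, data.byteLength);
    let h0 = seed0;
    let h1 = seed1;
    for (let i = 0; i < data.length; i += 4) {
        // Mix the next 32-bit word (assumed little-endian).
        let k = view.getUint32(i, true);
        k = Math.imul(k, 0xcc9e2d51);
        k = rotl32(k, 15);
        k = Math.imul(k, 0x1b873593);
        // Fold the mixed word into both running hashes.
        h0 = rotl32(h0 ^ k, 13);
        h1 = rotl32(h1 ^ k, 13);
        h0 = (Math.imul(h0, 5) + 0xe6546b64) >>> 0;
        h1 = (Math.imul(h1, 5) + 0xe6546b64) >>> 0;
    }
    return [h0 >>> 0, h1 >>> 0];
}

const hashA = dualMurmur3(new Uint8Array([1, 2, 3, 4]));
const hashB = dualMurmur3(new Uint8Array([1, 2, 3, 5]));
console.log(hashA.map(h => h.toString(16)), hashB.map(h => h.toString(16)));
```

`Math.imul` keeps the multiplications within 32 bits, which is what allows the same arithmetic to run on the host and on the device.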
 ## Hash length analysis
 Let's compute the probability of some people running into trouble because
 of hash collisions on pages. Assume:
 * uniform distribution of hashes
 * 10M users
 * each user programming for 50h, and flashing every 2 minutes, i.e., 1500 flashes
 * each flashing changing 10 pages
 With 64-bit hashes, the probability that at least one collision occurs is
 `1 - ((2^64 - 1) / 2^64) ^ (1e7 * 1500 * 10)`, which is approximately
 `1.5e11 / 2^64`, or about `8e-9`
 (evaluate it with care: in double-precision arithmetic the base rounds to `1`,
 so a naive calculation gives `0`).
 With 32-bit hashes, even with only 1M users we get a `97%` probability of some collisions.
 The uniformity of the hashes is questionable, but the `8e-9`
 gives us some wiggle room.
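The expression can be evaluated without losing precision as sketched below. The function name is illustrative; the sketch rewrites `1 - (1 - p)^n` as `-expm1(n * log1p(-p))`, because evaluating the expression literally in double precision rounds `(2^64 - 1) / 2^64` to exactly `1` and therefore yields `0`.

```typescript
// Probability of at least one collision among `comparisons` independent
// page comparisons, each colliding with probability 2^-hashBits.
function collisionProbability(hashBits: number, comparisons: number): number {
    const p = Math.pow(2, -hashBits);
    return -Math.expm1(comparisons * Math.log1p(-p));
}

const users64 = 1e7;
const users32 = 1e6;
const flashesPerUser = 1500;
const pagesPerFlash = 10;

const p64 = collisionProbability(64, users64 * flashesPerUser * pagesPerFlash);
const p32 = collisionProbability(32, users32 * flashesPerUser * pagesPerFlash);

// Literal evaluation, shown only to demonstrate the rounding problem.
const naive64 = 1 - Math.pow((Math.pow(2, 64) - 1) / Math.pow(2, 64), users64 * flashesPerUser * pagesPerFlash);

console.log(`64-bit: ${p64.toExponential(2)}, 32-bit: ${(p32 * 100).toFixed(1)}%, naive 64-bit: ${naive64}`);
```

For small probabilities the result is very close to the number of comparisons divided by `2^hashBits`, which gives a quick sanity check.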
@@ -0,0 +1,421 @@
 declare namespace DapJS {
    export interface IHID {
        write(data: ArrayBuffer): Promise<void>;
        read(): Promise<Uint8Array>;
        close(): Promise<void>;
        // sends each of the commands and expects one packet in response to each;
        // this makes for better performance when HID access is proxied
        sendMany?(commands: Uint8Array[]): Promise<Uint8Array[]>;
    }
    export class DAP {
        constructor(device: IHID);
        reconnect(): Promise<void>;
        init(): Promise<void>;
        close(): Promise<void>;
    }
    /**
     * # Memory Interface
     *
     * Controls access to the target's memory.
     *
     * ## Usage
     *
     * Using an instance of `CortexM`, as described before, we can simply read and
     * write numbers to memory as follows:
     *
     * ```typescript
     * const mem = core.memory;
     *
     * // NOTE: the address parameter must be word (4-byte) aligned.
     * await mem.write32(0x200000, 12345);
     * const val = await mem.read32(0x200000);
     *
     * // val === 12345
     *
     * // NOTE: the address parameter must be half-word (2-byte) aligned
     * await mem.write16(0x2000002, 65534);
     * const val16 = await mem.read16(0x2000002);
     *
     * // val16 === 65534
     * ```
     *
     * To read or write a larger block of memory, we can use `readBlock` and `writeBlock`. Again,
     * these blocks must be read from or written to word-aligned addresses in memory.
     *
     * ```typescript
     * const data = new Uint32Array([0x1234, 0x5678, 0x9ABC, 0xDEF0]);
     * await mem.writeBlock(0x200000, data);
     *
     * const readData = await mem.readBlock(0x200000, data.length, 0x100);
     * ```
     *
     * ## See also
     *
     * `PreparedMemoryCommand` provides an equivalent API with better performance (in some
     * cases) by enabling batched memory operations.
     */
    export class Memory {
        private dev;
        constructor(dev: DAP);
        /**
         * Write a 32-bit word to the specified (word-aligned) memory address.
         *
         * @param addr Memory address to write to
         * @param data Data to write (values above 2**32 will be truncated)
         */
        write32(addr: number, data: number): Promise<void>;
        /**
         * Write a 16-bit word to the specified (half word-aligned) memory address.
         *
         * @param addr Memory address to write to
         * @param data Data to write (values above 2**16 will be truncated)
         */
        write16(addr: number, data: number): Promise<void>;
        /**
         * Read a 32-bit word from the specified (word-aligned) memory address.
         *
         * @param addr Memory address to read from.
         */
        read32(addr: number): Promise<number>;
        /**
         * Read a 16-bit word from the specified (half word-aligned) memory address.
         *
         * @param addr Memory address to read from.
         */
        read16(addr: number): Promise<number>;
        /**
         * Reads a block of memory from the specified memory address.
         *
         * @param addr Address to read from
         * @param words Number of words to read
         * @param pageSize Memory page size
         */
        readBlock(addr: number, words: number, pageSize: number): Promise<Uint8Array>;
        /**
         * Write a block of memory to the specified memory address.
         *
         * @param addr Memory address to write to.
         * @param words Array of 32-bit words to write to memory.
         */
        writeBlock(addr: number, words: Uint32Array): Promise<void>;
        private readBlockCore(addr, words);
        private writeBlockCore(addr, words);
    }
    /**
 * # Cortex M
 *
 * Manages access to a CPU core, and its associated memory and debug functionality.
 *
 * > **NOTE:** all of the methods that involve interaction with the CPU core
 * > are asynchronous, so must be `await`ed, or explicitly handled as a Promise.
 *
 * ## Usage
 *
 * First, let's create an instance of `CortexM`, using an associated _Debug Access
 * Port_ (DAP) instance that we created earlier.
 *
 * ```typescript
 * const core = new CortexM(dap);
 * ```
 *
 * Now, we can halt and resume the core just like this:
 *
 * > **NOTE:** If you're not using ES2017, you can replace the use of `async` and
 * > `await` with direct use of Promises. These examples also need to be run within
 * > an `async` function for `await` to be used.
 *
 * ```typescript
 * await core.halt();
 * await core.resume();
 * ```
 *
 * Resetting the core is just as easy:
 *
 * ```typescript
 * await core.reset();
 * ```
 *
 * You can even halt immediately after reset:
 *
 * ```typescript
 * await core.reset(true);
 * ```
 *
 * We can also read and write 32-bit values to/from core registers:
 *
 * ```typescript
 * const sp = await core.readCoreRegister(CortexReg.SP);
 *
 * await core.writeCoreRegister(CortexReg.R0, 0x1000);
 * await core.writeCoreRegister(CortexReg.PC, 0x1234);
 * ```
 *
 * ### See also
 *
 * For details on debugging and memory features, see the documentation for
 * `Debug` and `Memory`.
 */
    export class CortexM {
        /**
         * Read and write to on-chip memory associated with this CPU core.
         */
        memory: Memory;
        /**
         * Control the CPU's debugging features.
         */
        debug: Debug;
        /**
         * Underlying Debug Access Port (DAP).
         */
        private dev;
        constructor(device: DAP);
        /**
         * Initialise the debug access port on the device, and read the device type.
         */
        init(): Promise<void>;
        /**
         * Read the current state of the CPU.
         *
         * @returns A member of the `CoreState` enum corresponding to the current status of the CPU.
         */
        getState(): Promise<CoreState>;
        /**
         * Read a core register from the CPU (e.g. r0...r15, pc, sp, lr, s0...)
         *
         * @param no Member of the `CortexReg` enum - an ARM Cortex CPU general-purpose register.
         */
        readCoreRegister(no: CortexReg): Promise<number>;
        /**
         * Write a 32-bit word to the specified CPU general-purpose register.
         *
         * @param no Member of the `CortexReg` enum - an ARM Cortex CPU general-purpose register.
         * @param val Value to be written.
         */
        writeCoreRegister(no: CortexReg, val: number): Promise<void>;
        /**
         * Halt the CPU core.
         */
        halt(): Promise<void>;
        /**
         * Resume the CPU core.
         */
        resume(): Promise<void>;
        /**
         * Find out whether the CPU is halted.
         */
        isHalted(): Promise<boolean>;
        /**
         * Read the current status of the CPU.
         *
         * @returns Object containing the contents of the `DHCSR` register, the `DFSR` register, and a boolean value
         * stating the current halted state of the CPU.
         */
        status(): Promise<{
            dfsr: number;
            dhscr: number;
            isHalted: boolean;
        }>;
        /**
         * Reset the CPU core. This currently does a software reset - it is also technically possible to perform a 'hard'
         * reset using the reset pin from the debugger.
         */
        reset(halt?: boolean): Promise<void>;
        /**
         * Run specified machine code natively on the device. Assumes usual C calling conventions
         * - returns the value of r0 once the program has terminated. The program _must_ terminate
         * in order for this function to return. This can be achieved by placing a `bkpt`
         * instruction at the end of the function.
         *
         * @param code array containing the machine code (32-bit words).
         * @param address memory address at which to place the code.
         * @param pc initial value of the program counter.
         * @param lr initial value of the link register.
         * @param sp initial value of the stack pointer.
         * @param upload should we upload the code before running it.
         * @param args set registers r0...rn before running code
         *
         * @returns A promise for the value of r0 on completion of the function call.
         */
        runCode(code: Uint32Array, address: number, pc: number, lr: number, sp: number, upload: boolean, ...args: number[]): Promise<number>;
        /**
         * Spin until the chip has halted.
         */
        waitForHalt(timeout?: number): Promise<void>;
        prepareCommand(): PreparedCortexMCommand;
        private softwareReset();
    }
    /**
 * # Cortex M: Prepared Command
 *
 * Allows batching of Cortex M-related commands, such as writing to a register,
 * halting and resuming the core.
 *
 * ## Example
 *
 * When preparing the sequence of commands, we can use the same API to prepare
 * a command as we would to execute them immediately.
 *
 * ```typescript
 * // Note that only the .go method is asynchronous.
 *
 * const prep = core.prepareCommand();
 * prep.writeCoreRegister(CortexReg.R0, 0x1000);
 * prep.writeCoreRegister(CortexReg.R1, 0x0);
 * prep.writeCoreRegister(CortexReg.PC, 0x2000000);
 * prep.resume();
 * ```
 *
 * We can then execute them as efficiently as possible by combining them together
 * and executing them like so.
 *
 * ```typescript
 * await prep.go();
 * ```
 *
 * The code above is equivalent to the following _non-prepared_ command:
 *
 * ```typescript
 * await core.writeCoreRegister(CortexReg.R0, 0x1000);
 * await core.writeCoreRegister(CortexReg.R1, 0x0);
 * await core.writeCoreRegister(CortexReg.PC, 0x2000000);
 * await core.resume();
 * ```
 *
 * Since the batched version of this code avoids making three round-trips to the
 * target, we are able to significantly improve performance. This is especially
 * noticeable when uploading a binary to flash memory, where a large number of
 * repetitive commands are used.
 *
 * ## Explanation
 *
 * For a detailed explanation of why prepared commands are used in DAP.js, see the
 * documentation for `PreparedDapCommand`.
 */
    export class PreparedCortexMCommand {
        private cmd;
        constructor(dap: DAP);
        /**
         * Schedule a 32-bit integer to be written to a core register.
         *
         * @param no Core register to be written.
         * @param val Value to write.
         */
        writeCoreRegister(no: CortexReg, val: number): void;
        /**
         * Schedule a halt command to be written to the CPU.
         */
        halt(): void;
        /**
         * Schedule a resume command to be written to the CPU.
         */
        resume(): void;
        /**
         * Execute all scheduled commands.
         */
        go(): Promise<void>;
    }
    export const enum CortexReg {
        R0 = 0,
        R1 = 1,
        R2 = 2,
        R3 = 3,
        R4 = 4,
        R5 = 5,
        R6 = 6,
        R7 = 7,
        R8 = 8,
        R9 = 9,
        R10 = 10,
        R11 = 11,
        R12 = 12,
        SP = 13,
        LR = 14,
        PC = 15,
        XPSR = 16,
        MSP = 17,
        PSP = 18,
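        // PRIMASK and CONTROL intentionally share selector 20: the debug register
        // selector returns one combined word that contains both of them.
        PRIMASK = 20,
        CONTROL = 20,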
    }
    export const enum CoreState {
        TARGET_RESET = 0,
        TARGET_LOCKUP = 1,
        TARGET_SLEEPING = 2,
        TARGET_HALTED = 3,
        TARGET_RUNNING = 4,
    }
    /**
     * # Debug Interface
     *
     * Keeps track of breakpoints set on the target, as well as deciding whether to
     * use a hardware breakpoint or a software breakpoint.
     *
     * ## Usage
     *
     * ```typescript
     * const dbg = core.debug;
     *
     * await dbg.setBreakpoint(0x123456);
     *
     * // resume the core and wait for the breakpoint
     * await core.resume();
     * await core.waitForHalt();
     *
     * // step forward one instruction
     * await dbg.step();
     *
     * // remove the breakpoint
     * await dbg.deleteBreakpoint(0x123456);
     * ```
     */
    export class Debug {
        private core;
        private breakpoints;
        private availableHWBreakpoints;
        private totalHWBreakpoints;
        private enabled;
        constructor(core: CortexM);
        init(): Promise<void>;
        /**
         * Enable debugging on the target CPU
         */
        enable(): Promise<void>;
        /**
         * Set a breakpoint at the specified memory address.
         *
         * @param addr Memory address at which to set the breakpoint.
         */
        setBreakpoint(addr: number): Promise<void>;
        deleteBreakpoint(addr: number): Promise<void>;
        /**
         * Step the processor forward by one instruction.
         */
        step(): Promise<void>;
        /**
         * Set up (and disable) the Flash Patch & Breakpoint unit. It will be enabled when
         * the first breakpoint is set.
         *
         * Also reads the number of available hardware breakpoints.
         */
        private setupFpb();
        /**
         * Enable or disable the Flash Patch and Breakpoint unit (FPB).
         *
         * @param enabled
         */
        private setFpbEnabled(enabled?);
    }
 }
@@ -1,8 +1,366 @@
 /// <reference path="../node_modules/pxt-core/built/pxteditor.d.ts" />
 interface Math {
    imul(x: number, y: number): number;
 }
 namespace pxt.editor {
-    initExtensionsAsync = function(opts: pxt.editor.ExtensionOptions): Promise<pxt.editor.ExtensionResult> {
+    import UF2 = pxtc.UF2;
    const pageSize = 1024;
    const numPages = 256;
    function murmur3_core(data: Uint8Array) {
        let h0 = 0x2F9BE6CC;
        let h1 = 0x1EC3A6C8;
        for (let i = 0; i < data.length; i += 4) {
            let k = HF2.read32(data, i) >>> 0
            k = Math.imul(k, 0xcc9e2d51);
            k = (k << 15) | (k >>> 17);
            k = Math.imul(k, 0x1b873593);
            h0 ^= k;
            h1 ^= k;
            h0 = (h0 << 13) | (h0 >>> 19);
            h1 = (h1 << 13) | (h1 >>> 19);
            h0 = (Math.imul(h0, 5) + 0xe6546b64) >>> 0;
            h1 = (Math.imul(h1, 5) + 0xe6546b64) >>> 0;
        }
        return [h0, h1]
    }
    class DAPWrapper {
        cortexM: DapJS.CortexM
        constructor(h: HF2.PacketIO) {
            let pbuf = new U.PromiseBuffer<Uint8Array>()
            let sendMany = (cmds: Uint8Array[]) => {
                return h.talksAsync(cmds.map(c => ({ cmd: 0, data: c })))
            }
            if (!h.talksAsync)
                sendMany = null
            let dev = new DapJS.DAP({
                write: writeAsync,
                close: closeAsync,
                read: readAsync,
                sendMany: sendMany
            })
            this.cortexM = new DapJS.CortexM(dev)
            h.onData = buf => {
                pbuf.push(buf)
            }
            function writeAsync(data: ArrayBuffer) {
                h.sendPacketAsync(new Uint8Array(data))
                return Promise.resolve()
            }
            function readAsync() {
                return pbuf.shiftAsync()
            }
            function closeAsync() {
                return h.disconnectAsync()
            }
        }
        reconnectAsync(first: boolean) {
            return this.cortexM.init()
        }
    }
    function dapAsync() {
        return pxt.HF2.mkPacketIOAsync()
            .then(h => {
                let w = new DAPWrapper(h)
                return w.reconnectAsync(true)
                    .then(() => w)
            })
    }
    let noHID = false
    let initPromise: Promise<DAPWrapper>
    function initAsync() {
        if (initPromise)
            return initPromise
        let canHID = false
        if (U.isNodeJS) {
            canHID = true
        } else {
            const forceHexDownload = /forceHexDownload/i.test(window.location.href);
            if (Cloud.isLocalHost() && Cloud.localToken && !forceHexDownload)
                canHID = true
        }
        if (noHID)
            canHID = false
        if (canHID) {
            initPromise = dapAsync()
                .catch(err => {
                    initPromise = null
                    noHID = true
                    return Promise.reject(err)
                })
        } else {
            noHID = true
            initPromise = Promise.reject(new Error("no HID"))
        }
        return initPromise
    }
    function pageAlignBlocks(blocks: UF2.Block[], pageSize: number) {
        U.assert(pageSize % 256 == 0)
        let res: UF2.Block[] = []
        for (let i = 0; i < blocks.length;) {
            let b0 = blocks[i]
            let newbuf = new Uint8Array(pageSize)
            let startPad = b0.targetAddr & (pageSize - 1)
            let newAddr = b0.targetAddr - startPad
            for (; i < blocks.length; ++i) {
                let b = blocks[i]
                if (b.targetAddr + b.payloadSize > newAddr + pageSize)
                    break
                U.memcpy(newbuf, b.targetAddr - newAddr, b.data, 0, b.payloadSize)
            }
            let bb = U.flatClone(b0)
            bb.data = newbuf
            bb.targetAddr = newAddr
            bb.payloadSize = pageSize
            res.push(bb)
        }
        return res
    }
    const flashPageBINquick = new Uint32Array([
        0xbe00be00, // bkpt - LR is set to this
        0x2480b5f0, 0x00e42300, 0x58cd58c2, 0xd10342aa, 0x42a33304, 0xbdf0d1f8,
        0x4b162502, 0x509d4a16, 0x2d00591d, 0x24a1d0fc, 0x511800e4, 0x3cff3c09,
        0x591e0025, 0xd0fc2e00, 0x509c2400, 0x2c00595c, 0x2401d0fc, 0x509c2580,
        0x595c00ed, 0xd0fc2c00, 0x00ed2580, 0x002e2400, 0x5107590f, 0x2f00595f,
        0x3404d0fc, 0xd1f742ac, 0x50992100, 0x2a00599a, 0xe7d0d0fc, 0x4001e000,
        0x00000504,
    ])
    // doesn't check if data is already there - for timing
    const flashPageBIN = new Uint32Array([
        0xbe00be00, // bkpt - LR is set to this
        0x2402b5f0, 0x4a174b16, 0x2480509c, 0x002500e4, 0x2e00591e, 0x24a1d0fc,
        0x511800e4, 0x2c00595c, 0x2400d0fc, 0x2480509c, 0x002500e4, 0x2e00591e,
        0x2401d0fc, 0x595c509c, 0xd0fc2c00, 0x00ed2580, 0x002e2400, 0x5107590f,
        0x2f00595f, 0x3404d0fc, 0xd1f742ac, 0x50992100, 0x2a00599a, 0xbdf0d0fc,
        0x4001e000, 0x00000504,
    ])
    // void computeHashes(uint32_t *dst, uint8_t *ptr, uint32_t pageSize, uint32_t numPages)
    const computeChecksums2 = new Uint32Array([
        0x4c27b5f0, 0x44a52680, 0x22009201, 0x91004f25, 0x00769303, 0x24080013,
        0x25010019, 0x40eb4029, 0xd0002900, 0x3c01407b, 0xd1f52c00, 0x468c0091,
        0xa9044665, 0x506b3201, 0xd1eb42b2, 0x089b9b01, 0x23139302, 0x9b03469c,
        0xd104429c, 0x2000be2a, 0x449d4b15, 0x9f00bdf0, 0x4d149e02, 0x49154a14,
        0x3e01cf08, 0x2111434b, 0x491341cb, 0x405a434b, 0x4663405d, 0x230541da,
        0x4b10435a, 0x466318d2, 0x230541dd, 0x4b0d435d, 0x2e0018ed, 0x6002d1e7,
        0x9a009b01, 0x18d36045, 0x93003008, 0xe7d23401, 0xfffffbec, 0xedb88320,
        0x00000414, 0x1ec3a6c8, 0x2f9be6cc, 0xcc9e2d51, 0x1b873593, 0xe6546b64,
    ])
    let startTime = 0
    function log(msg: string) {
        let now = Date.now()
        if (!startTime) startTime = now
        now -= startTime
        let ts = ("00000" + now).slice(-5)
        pxt.log(`HID ${ts}: ${msg}`)
    }
    const membase = 0x20000000
    const loadAddr = membase
    const dataAddr = 0x20002000
    const stackAddr = 0x20001000
    export const bufferConcat = (bufs: Uint8Array[]) => {
        let len = 0;
        for (const b of bufs) {
            len += b.length;
        }
        const r = new Uint8Array(len);
        len = 0;
        for (const b of bufs) {
            r.set(b, len);
            len += b.length;
        }
        return r;
    };
    function getFlashChecksumsAsync(wrap: DAPWrapper) {
        log("getting existing flash checksums")
        let pages = numPages
        return wrap.cortexM.runCode(computeChecksums2, loadAddr, loadAddr + 1, 0xffffffff, stackAddr, true,
            dataAddr, 0, pageSize, pages)
            .then(() => wrap.cortexM.memory.readBlock(dataAddr, pages * 2, pageSize))
    }
    function onlyChanged(blocks: UF2.Block[], checksums: Uint8Array) {
        return blocks.filter(b => {
            let idx = b.targetAddr / pageSize
            U.assert((idx | 0) == idx)
            U.assert(b.data.length == pageSize)
            if (idx * 8 + 8 > checksums.length)
                return true // out of range?
            let c0 = HF2.read32(checksums, idx * 8)
            let c1 = HF2.read32(checksums, idx * 8 + 4)
            let ch = murmur3_core(b.data)
            if (c0 == ch[0] && c1 == ch[1])
                return false
            return true
        })
    }
    export function deployCoreAsync(resp: pxtc.CompileResult, isCli = false): Promise<void> {
        let saveHexAsync = () => {
            if (isCli) {
                return Promise.resolve()
            } else {
                return pxt.commands.saveOnlyAsync(resp)
            }
        }
        startTime = 0
        if (noHID) return saveHexAsync()
        let wrap: DAPWrapper
        log("init")
        let logV = (msg: string) => { }
        //let logV = log
        const runFlash = (b: UF2.Block, dataAddr: number) => {
            const cmd = wrap.cortexM.prepareCommand();
            cmd.halt();
            cmd.writeCoreRegister(DapJS.CortexReg.PC, loadAddr + 4 + 1);
            cmd.writeCoreRegister(DapJS.CortexReg.LR, loadAddr + 1);
            cmd.writeCoreRegister(DapJS.CortexReg.SP, stackAddr);
            cmd.writeCoreRegister(0, b.targetAddr);
            cmd.writeCoreRegister(1, dataAddr);
            return Promise.resolve()
                .then(() => {
                    logV("setregs")
                    return cmd.go()
                })
                .then(() => {
                    logV("dbg en")
                    // starts the program
                    return wrap.cortexM.debug.enable()
                })
        }
        let checksums: Uint8Array
        return initAsync()
            .then(w => {
                wrap = w
                log("reset")
                return wrap.cortexM.reset(true)
            })
            .then(() => getFlashChecksumsAsync(wrap))
            .then(buf => {
                checksums = buf
                log("write code")
                return wrap.cortexM.memory.writeBlock(loadAddr, flashPageBIN)
            })
            .then(() => {
                log("convert")
                // TODO this is seriously inefficient (130ms on a fast machine)
                let uf2 = UF2.newBlockFile()
                UF2.writeHex(uf2, resp.outfiles[pxtc.BINARY_HEX].split(/\r?\n/))
                let bytes = U.stringToUint8Array(UF2.serializeFile(uf2))
                let parsed = UF2.parseFile(bytes)
                let aligned = pageAlignBlocks(parsed, pageSize)
                log(`initial: ${aligned.length} pages`)
                aligned = onlyChanged(aligned, checksums)
                log(`incremental: ${aligned.length} pages`)
                return Promise.mapSeries(U.range(aligned.length),
                    i => {
                        let b = aligned[i]
                        if (b.targetAddr >= 0x10000000)
                            return Promise.resolve()
                        logV("about to write at 0x" + b.targetAddr.toString(16))
                        let writeBl = Promise.resolve()
                        let thisAddr = (i & 1) ? dataAddr : dataAddr + pageSize
                        let nextAddr = (i & 1) ? dataAddr + pageSize : dataAddr
                        if (i == 0) {
                            let u32data = new Uint32Array(b.data.length / 4)
                            for (let j = 0; j < b.data.length; j += 4)
                                u32data[j >> 2] = HF2.read32(b.data, j)
                            writeBl = wrap.cortexM.memory.writeBlock(thisAddr, u32data)
                        }
                        return writeBl
                            .then(() => runFlash(b, thisAddr))
                            .then(() => {
                                let next = aligned[i + 1]
                                if (!next)
                                    return Promise.resolve()
                                logV("write next")
                                let buf = new Uint32Array(next.data.buffer)
                                return wrap.cortexM.memory.writeBlock(nextAddr, buf)
                            })
                            .then(() => {
                                logV("wait")
                                return wrap.cortexM.waitForHalt(500)
                            })
                            .then(() => {
                                logV("done block")
                            })
                    })
                    .then(() => {
                        log("flash done")
                        return wrap.cortexM.reset(false)
                    })
            })
            .catch(e => {
                // if we failed to initialize HID, fall back to saving the .hex file
                if (noHID)
                    return saveHexAsync()
                else
                    return Promise.reject(e)
            })
    }
    initExtensionsAsync = function (opts: pxt.editor.ExtensionOptions): Promise<pxt.editor.ExtensionResult> {
        pxt.debug('loading microbit target extensions...')
        if (!Math.imul)
            Math.imul = (a, b) => {
                var ah = (a >>> 16) & 0xffff;
                var al = a & 0xffff;
                var bh = (b >>> 16) & 0xffff;
                var bl = b & 0xffff;
                // the shift by 0 fixes the sign on the high part
                // the final |0 converts the unsigned value into a signed value
                return ((al * bl) + (((ah * bl + al * bh) << 16) >>> 0) | 0);
            };
        const res: pxt.editor.ExtensionResult = {
            hexFileImporters: [{
                id: "blockly",
@@ -24,6 +382,8 @@ namespace pxt.editor {
                            .then(text => project.overrideTypescriptFile(text))
                }]
        };
        pxt.commands.deployCoreAsync = deployCoreAsync;
        return Promise.resolve<pxt.editor.ExtensionResult>(res);
    }
 }
@@ -8,5 +8,6 @@
        "rootDir": ".",
        "newLine": "LF",
        "sourceMap": false
-    }
+    },
    "prepend": ["../external/dapjs.js"]
 }
@@ -0,0 +1,6 @@
 {
  "build": {
    "target": "bbc-microbit-classic-gcc,*",
    "targetSetExplicitly": true
  }
 }
@@ -0,0 +1,5 @@
 #!/bin/sh
 yotta build
 arm-none-eabi-objdump -d `find . -name main.c.o` > disasm
 node genapplet.js disasm Reset_Handler
 rm disasm
@@ -0,0 +1,48 @@
 let fs = require("fs")
 let s = fs.readFileSync(process.argv[2], "utf8")
 let infun = false
 let words = []
 for (let l of s.split(/\n/)) {
    let m = /^00000000 <(.*)>:/.exec(l)
    if (m && m[1] == process.argv[3]) infun = true
    if (/^Disassembly/.test(l)) infun = false
    if (!infun) continue
    m = /^\s*[0-9a-f]+:\s+([0-9a-f]+)( ([0-9a-f]{4}))?\s+/.exec(l)
    if (m) {
        let n = m[1]
        words.push(n)
        if (m[3])
            words.push(m[3])
        if (n.length == 4 || n.length == 8) {
            // ok
        } else {
            throw new Error("unexpected instruction word length: " + n)
        }
    }
 }
 let ww = []
 let pref = ""
 for (let w of words) {
    if (w.length == 8) {
        if (pref) throw new Error("32-bit word follows an unpaired 16-bit halfword")
        ww.push("0x" + w)
    } else {
        if (pref) {
            ww.push("0x" + w + pref)
            pref = ""
        } else {
            pref = w
        }
    }
 }
 words = ww
 let r = ""
 for (let i = 0; i < words.length; i++) {
    if (i % 6 == 0) r += "\n"
    r += words[i] + ", "
 }
 console.log(r)
@@ -0,0 +1,9 @@
 {
  "name": "sha",
  "version": "0.0.0",
  "keywords": [],
  "author": "",
  "license": "MIT",
  "dependencies": {},
  "bin": "./source"
 }
@@ -0,0 +1,240 @@
 #include <stdint.h>
 static const uint32_t sha256_k[] = {
    0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1,
    0x923f82a4, 0xab1c5ed5, 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3,
    0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174, 0xe49b69c1, 0xefbe4786,
    0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
    0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147,
    0x06ca6351, 0x14292967, 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13,
    0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85, 0xa2bfe8a1, 0xa81a664b,
    0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
    0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a,
    0x5b9cca4f, 0x682e6ff3, 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208,
    0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2};
 #define rotr(v, b) (((uint32_t)(v) >> (b)) | ((uint32_t)(v) << (32 - (b))))
 static inline void sha256round(uint32_t *hs, uint32_t *w) {
  for (int i = 16; i < 64; ++i) {
    uint32_t s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3);
    uint32_t s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10);
    w[i] = w[i - 16] + s0 + w[i - 7] + s1;
  }
  uint32_t a = hs[0];
  uint32_t b = hs[1];
  uint32_t c = hs[2];
  uint32_t d = hs[3];
  uint32_t e = hs[4];
  uint32_t f = hs[5];
  uint32_t g = hs[6];
  uint32_t h = hs[7];
  for (int i = 0; i < 64; ++i) {
    uint32_t s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25);
    uint32_t ch = (e & f) ^ (~e & g);
    uint32_t temp1 = (h + s1 + ch + sha256_k[i] + w[i]);
    uint32_t s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22);
    uint32_t maj = (a & b) ^ (a & c) ^ (b & c);
    uint32_t temp2 = (s0 + maj);
    h = g;
    g = f;
    f = e;
    e = (d + temp1);
    d = c;
    c = b;
    b = a;
    a = (temp1 + temp2);
  }
  hs[0] += a;
  hs[1] += b;
  hs[2] += c;
  hs[3] += d;
  hs[4] += e;
  hs[5] += f;
  hs[6] += g;
  hs[7] += h;
 }
 #define INLINE __attribute__((always_inline)) static inline
 INLINE void sha256block(uint8_t *buf, uint32_t len, uint32_t *dst) {
  uint32_t hs[] = {0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
                   0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19};
  uint32_t w[64];
  for (uint32_t i = 0; i < len; i += 64) {
    for (uint32_t j = 0; j < 16; j++) {
      uint32_t off = (j << 2) + i;
      w[j] = ((uint32_t)buf[off] << 24) | ((uint32_t)buf[off + 1] << 16) |
             ((uint32_t)buf[off + 2] << 8) | buf[off + 3];
    }
    sha256round(hs, w);
  }
  dst[0] = hs[0];
  dst[1] = hs[1];
 }
 #define POLYNOMIAL 0xEDB88320
 INLINE void makeCRC32tab(uint32_t *table) {
  for (uint32_t b = 0; b < 256; ++b) {
    uint32_t r = b;
    for (uint32_t j = 0; j < 8; ++j) {
      if (r & 1)
        r = (r >> 1) ^ POLYNOMIAL;
      else
        r = (r >> 1);
    }
    table[b] = r;
  }
 }
 INLINE uint32_t crc(const uint8_t *p, uint32_t len, uint32_t *crcTable) {
  uint32_t crc = ~0U;
  for (uint32_t i = 0; i < len; ++i)
    crc = crcTable[*p++ ^ (crc & 0xff)] ^ (crc >> 8);
  return (~crc);
 }
 INLINE uint32_t murmur3_core(const uint8_t *data, uint32_t len) {
  uint32_t h = 0x2F9BE6CC;
  const uint32_t *data32 = (const uint32_t *)data;
  uint32_t i = len >> 2;
  do {
    uint32_t k = *data32++;
    k *= 0xcc9e2d51;
    k = (k << 15) | (k >> 17);
    k *= 0x1b873593;
    h ^= k;
    h = (h << 13) | (h >> 19);
    h = (h * 5) + 0xe6546b64;
  } while (--i);
  return h;
 }
 INLINE void murmur3_core_2(const uint8_t *data, uint32_t len, uint32_t *dst) {
  // Compute two hashes with different seeds in one pass to reduce the chance
  // of collisions. len must be at least 4; bytes past a multiple of 4 are ignored.
  uint32_t h0 = 0x2F9BE6CC;
  uint32_t h1 = 0x1EC3A6C8;
  const uint32_t *data32 = (const uint32_t *)data;
  uint32_t i = len >> 2;
  do {
    uint32_t k = *data32++;
    k *= 0xcc9e2d51;
    k = (k << 15) | (k >> 17);
    k *= 0x1b873593;
    h0 ^= k;
    h1 ^= k;
    h0 = (h0 << 13) | (h0 >> 19);
    h1 = (h1 << 13) | (h1 >> 19);
    h0 = (h0 * 5) + 0xe6546b64;
    h1 = (h1 * 5) + 0xe6546b64;
  } while (--i);
  dst[0] = h0;
  dst[1] = h1;
 }
 int Reset_Handler(uint32_t *dst, uint8_t *ptr, uint32_t pageSize,
                  uint32_t numPages) {
  uint32_t crcTable[256];
  makeCRC32tab(crcTable);
  for (uint32_t i = 0; i < numPages; ++i) {
 #if 0
    sha256block(ptr, pageSize, dst);
 #elif 0
    dst[0] = crc(ptr, pageSize, crcTable);
    dst[1] = murmur3_core(ptr, pageSize);
 #else
    murmur3_core_2(ptr, pageSize, dst);
 #endif
    dst += 2;
    ptr += pageSize;
  }
 #ifdef __arm__
  __asm__("bkpt 42");
 #endif
  return 0;
 }
 #if 0
 #define PAGE_SIZE 0x400
 #define SIZE_IN_WORDS (PAGE_SIZE / 4)
 #define setConfig(v)                                                           \
  do {                                                                         \
    NRF_NVMC->CONFIG = v;                                                      \
    while (NRF_NVMC->READY == NVMC_READY_READY_Busy)                           \
      ;                                                                        \
  } while (0)
 void overwriteFlashPage(uint32_t *to, uint32_t *from) {
  int same = 1;
  for (int i = 0; i <= (SIZE_IN_WORDS - 1); i++) {
    if (to[i] != from[i]) {
      same = 0;
      break;
    }
  }
  if (same)
    return;
  // Turn on flash erase enable and wait until the NVMC is ready:
  setConfig(NVMC_CONFIG_WEN_Een << NVMC_CONFIG_WEN_Pos);
  // Erase page:
  NRF_NVMC->ERASEPAGE = (uint32_t)to;
  while (NRF_NVMC->READY == NVMC_READY_READY_Busy)
    ;
  // Turn off flash erase enable and wait until the NVMC is ready:
  setConfig(NVMC_CONFIG_WEN_Ren << NVMC_CONFIG_WEN_Pos);
  // Turn on flash write enable and wait until the NVMC is ready:
  setConfig(NVMC_CONFIG_WEN_Wen << NVMC_CONFIG_WEN_Pos);
  for (int i = 0; i <= (SIZE_IN_WORDS - 1); i++) {
    *(to + i) = *(from + i);
    while (NRF_NVMC->READY == NVMC_READY_READY_Busy)
      ;
  }
  // Turn off flash write enable and wait until the NVMC is ready:
  setConfig(NVMC_CONFIG_WEN_Ren << NVMC_CONFIG_WEN_Pos);
 }
 #endif
 #ifndef __arm__
 #define PS 1024
 #define NP 10
 #include <stdio.h>
 #include <string.h>
 int main() {
  uint8_t buf[NP * PS];
  uint32_t sums[NP * 2];
  memset(buf, 0, sizeof(buf));
  for (int i = 0; i < PS; ++i)
    buf[i] = i;
  for (int i = 0; i < PS; ++i)
    buf[i + PS] = 108;
  Reset_Handler(sums, buf, PS, NP);
  for (int i = 0; i < NP; ++i) {
    printf("%08x %08x\n", sums[i * 2], sums[i * 2 + 1]);
  }
  return 0;
 }
 #endif
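
For partial flashing, the host needs the same per-page checksums that `Reset_Handler` writes to `dst`, so that only pages whose checksums differ are rewritten. The following is a minimal TypeScript sketch of that comparison, not the code from this commit: the names `imul32`, `murmur3Core2` and `changedPages` are hypothetical, words are assumed to be little-endian as on the target, and `deviceSums` is assumed to hold two 32-bit values per page in the order the applet writes them.

```typescript
// 32-bit wrapping multiply, same shape as the helper in the editor extension.
function imul32(a: number, b: number): number {
    const ah = (a >>> 16) & 0xffff, al = a & 0xffff;
    const bh = (b >>> 16) & 0xffff, bl = b & 0xffff;
    return ((al * bl) + (((ah * bl + al * bh) << 16) >>> 0)) | 0;
}

// Hypothetical host-side mirror of murmur3_core_2 in main.c.
// Assumes page.length is a non-zero multiple of 4.
function murmur3Core2(page: Uint8Array): [number, number] {
    let h0 = 0x2F9BE6CC;
    let h1 = 0x1EC3A6C8;
    for (let off = 0; off + 4 <= page.length; off += 4) {
        // Assemble one little-endian 32-bit word.
        let k = page[off] | (page[off + 1] << 8) | (page[off + 2] << 16) | (page[off + 3] << 24);
        k = imul32(k, 0xcc9e2d51);
        k = (k << 15) | (k >>> 17);
        k = imul32(k, 0x1b873593);
        h0 ^= k;
        h1 ^= k;
        h0 = (h0 << 13) | (h0 >>> 19);
        h1 = (h1 << 13) | (h1 >>> 19);
        h0 = (imul32(h0, 5) + 0xe6546b64) | 0;
        h1 = (imul32(h1, 5) + 0xe6546b64) | 0;
    }
    return [h0 >>> 0, h1 >>> 0];
}

// Returns the indices of pages whose checksums differ from the device's.
function changedPages(image: Uint8Array, deviceSums: number[], pageSize: number): number[] {
    const changed: number[] = [];
    for (let i = 0; i * pageSize < image.length; i++) {
        const sums = murmur3Core2(image.subarray(i * pageSize, (i + 1) * pageSize));
        if (sums[0] !== deviceSums[2 * i] || sums[1] !== deviceSums[2 * i + 1])
            changed.push(i);
    }
    return changed;
}
```

Because each page costs only two 32-bit words to read back over HID, comparing checksums is far cheaper than reading the flash contents themselves, which is the point of running the hash on the device.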
@@ -28,8 +28,7 @@
        "driveName": "MICROBIT",
        "hexMimeType": "application/x-microbit-hex",
        "openocdScript": "source [find interface/cmsis-dap.cfg]; source [find target/nrf51.cfg]",
-        "upgrades": [
-            {
+        "upgrades": [{
                "type": "package",
                "map": {
                    "microbit": "core",
@@ -194,11 +193,12 @@
        "serviceId": "microbit"
    },
    "serial": {
        "productFilter": "0x0204",
        "vendorFilter": "0x0d28",
        "nameFilter": "^mbed Serial Port",
        "log": true,
-        "chromeExtension": "hjcflblhjoglmjjkecamiegdigfkgeni"
+        "chromeExtension": "hjcflblhjoglmjjkecamiegdigfkgeni",
        "vendorId": "0x0d28",
        "productId": "0x0204",
        "rawHID": true
    },
    "appTheme": {
        "accentColor": "#5C005C",
@@ -234,8 +234,7 @@
        "appStoreID": "1092687276",
        "mobileSafariDownloadProtocol": "microbithex://?data",
        "extendEditor": true,
-        "docMenu": [
-            {
+        "docMenu": [{
                "name": "Support",
                "path": "https://support.microbit.org/"
            },
@@ -271,8 +270,7 @@
        ],
        "hasReferenceDocs": true,
        "usbDocs": "/device/usb",
-        "usbHelp": [
-            {
+        "usbHelp": [{
                "name": "connection",
                "os": "*",
                "browser": "*",