Frank Mitchell

Posted: 2023-04-12
Last Modified: 2023-07-21
Word Count: 853
Tags: c-programming lua programming python ruby

This project will create a C library to parse and emit JSON. It will then provide wrappers for various languages including Lua, Python, and Ruby.


Since there are so many C JSON parsers already, this one will try to do things a little differently:

  1. It will offer a “pull parser” interface modeled after JSONPP.
  2. The implementation will try to avoid heap allocation where it can, and memory leaks in general.
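
To make the two goals concrete, here is a minimal sketch of the pull style with no heap allocation. It is not the planned library: `Mini_Event`, `Mini_Scanner`, `mini_init`, `mini_next`, and `mini_max_depth` are hypothetical names invented for illustration, and the scanner only reports brackets and braces (skipping over string bodies) rather than parsing full JSON.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down event set used only by this sketch. */
typedef enum {
    MINI_START_ARRAY,
    MINI_END_ARRAY,
    MINI_START_OBJECT,
    MINI_END_OBJECT,
    MINI_END_STREAM
} Mini_Event;

/* All scanner state lives in this caller-owned struct, so no malloc is needed. */
typedef struct {
    const char *cur;  /* next unread character */
    const char *end;  /* one past the last character */
} Mini_Scanner;

static void mini_init(Mini_Scanner *s, const char *text, size_t len) {
    assert(s != NULL && text != NULL);
    s->cur = text;
    s->end = text + len;
}

/* Pull the next structural event; the caller decides when to ask again. */
static Mini_Event mini_next(Mini_Scanner *s) {
    assert(s != NULL);
    while (s->cur < s->end) {
        char c = *s->cur++;
        switch (c) {
        case '[': return MINI_START_ARRAY;
        case ']': return MINI_END_ARRAY;
        case '{': return MINI_START_OBJECT;
        case '}': return MINI_END_OBJECT;
        case '"':
            /* Skip the string body so brackets inside strings are ignored. */
            while (s->cur < s->end && *s->cur != '"') {
                if (*s->cur == '\\' && s->cur + 1 < s->end) {
                    s->cur++;  /* step onto the escaped character */
                }
                s->cur++;
            }
            if (s->cur < s->end) {
                s->cur++;  /* consume the closing quote */
            }
            break;
        default:
            break;  /* whitespace, literals, numbers, ':' and ',' are ignored */
        }
    }
    return MINI_END_STREAM;
}

/* Example driver: find the deepest nesting level by pulling events in a loop. */
static int mini_max_depth(const char *text, size_t len) {
    Mini_Scanner s;
    int depth = 0, max = 0;
    mini_init(&s, text, len);
    for (;;) {
        Mini_Event ev = mini_next(&s);
        if (ev == MINI_END_STREAM) break;
        if (ev == MINI_START_ARRAY || ev == MINI_START_OBJECT) {
            if (++depth > max) max = depth;
        } else {
            depth--;
        }
    }
    return max;
}
```

The point of the pull style is visible in `mini_max_depth`: the client owns the loop and asks for one event at a time, instead of registering callbacks that the parser invokes.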


#include <math.h>     /* double_t */
#include <stdbool.h>
#include <stddef.h>   /* size_t */
#include <stdint.h>

typedef enum {
    START_STREAM,   /* initial state before parsing */
    START_ARRAY,    /* read `[` */
    END_ARRAY,      /* read `]` */
    START_OBJECT,   /* read `{` */
    END_OBJECT,     /* read `}` */
    KEY_NAME,       /* read key for Object */
    VALUE_NULL,     /* read `null` */
    VALUE_TRUE,     /* read `true` */
    VALUE_FALSE,    /* read `false` */
    VALUE_NUMBER,   /* read JSON Number */
    VALUE_STRING,   /* read JSON String */
    END_STREAM      /* reached final state */
} Json_Event;

typedef struct _Json_Pull_Parser Json_Pull_Parser;

/**
 * A convenience type for a single byte character.
 * The parser only accepts ASCII characters in its input.
 */
typedef char    byte_t;

/**
 * A convenience type for a byte in a UTF-8 sequence.
 */
typedef uint8_t utf8_t;

/**
 * A callback to fetch more characters from a file, file descriptor, socket,
 * or anything else.
 * `data` is the same pointer passed into `Json_Pull_Parser_new()`.
 * On each call, the client returns a pointer to new bytes to parse
 * and sets `*sizptr` to the number of bytes pointed to.
 * The parser checks that all characters are in the ASCII range.
 * Buffer management is the client routine's responsibility, perhaps in
 * conjunction with the `data` pointer.
 */
typedef const byte_t* (*Json_Reader)(void* data, size_t *sizptr);

/**
 * Create a new parser at `*pptr`, reading in characters with `r` and `d`.
 */
void Json_Pull_Parser_new(Json_Pull_Parser* *pptr, Json_Reader r, void* d);

/**
 * Advance to the next significant token in the input.
 */
void Json_Pull_Parser_do_next(Json_Pull_Parser* p);

/**
 * The type of the last event parsed.
 */
Json_Event Json_Pull_Parser_event(Json_Pull_Parser* p);

/**
 * Whether the parser is currently processing a JSON Array.
 */
bool Json_Pull_Parser_in_array(Json_Pull_Parser* p);

/**
 * Whether the parser is currently processing a JSON Object.
 */
bool Json_Pull_Parser_in_object(Json_Pull_Parser* p);

/**
 * The number value of the last event parsed, if that event was VALUE_NUMBER,
 * or NULL otherwise.
 * Clients should copy the value before the next call to
 * `Json_Pull_Parser_do_next()` or this value will be overwritten.
 */
double_t* Json_Pull_Parser_number(Json_Pull_Parser* p);

/**
 * The string value of the last event parsed, or NULL if the last event
 * had no string value.
 * The result converts all Unicode and other escape sequences to UTF-8 bytes.
 * Clients should copy the string before the next call to
 * `Json_Pull_Parser_do_next()` or this value will be overwritten.
 */
const utf8_t* Json_Pull_Parser_string(Json_Pull_Parser* p);

/**
 * Acquire another reference to this parser, so that a call to
 * `Json_Pull_Parser_release()` doesn't immediately destroy the parser.
 */
Json_Pull_Parser* Json_Pull_Parser_retain(Json_Pull_Parser* p);

/**
 * End parsing and release the memory held by this parser, if no other
 * client "retains" it.
 */
void Json_Pull_Parser_release(Json_Pull_Parser* *pptr);
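
The retain/release pair reads like plain reference counting. The following is only a sketch of how that could work, not the library's implementation: `Demo_Parser` and its functions are hypothetical stand-ins for the opaque `Json_Pull_Parser`, the count is not thread-safe, and clearing `*pptr` on release is an assumption suggested by the pointer-to-pointer signature.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the opaque parser; only the count matters here. */
typedef struct {
    unsigned refcount;
} Demo_Parser;

/* Create a parser at `*pptr` with one reference; `*pptr` is NULL on failure. */
static void Demo_Parser_new(Demo_Parser* *pptr) {
    assert(pptr != NULL);
    Demo_Parser *p = malloc(sizeof *p);
    if (p != NULL) {
        p->refcount = 1;
    }
    *pptr = p;
}

/* Add a reference and hand back the same pointer for convenience. */
static Demo_Parser* Demo_Parser_retain(Demo_Parser* p) {
    if (p != NULL) {
        p->refcount++;
    }
    return p;
}

/* Drop one reference; free the parser only when the last one is gone. */
static void Demo_Parser_release(Demo_Parser* *pptr) {
    if (pptr == NULL || *pptr == NULL) {
        return;
    }
    Demo_Parser *p = *pptr;
    *pptr = NULL;  /* ASSUMPTION: the caller's handle is cleared either way */
    if (--p->refcount == 0) {
        free(p);
    }
}
```

Under these assumptions, a wrapper could retain the parser for as long as a host-language object refers to it and release it from that object's finalizer.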


Wrappers will use native I/O to construct a tree of native equivalents to JSON Arrays and Objects.
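
On the C side, whatever I/O a wrapper uses would reach the parser through a `Json_Reader` callback. As a rough sketch, the reader below hands out an in-memory string in fixed-size chunks. `Chunk_Source`, `chunk_reader`, and `drain_reader` are hypothetical names; the two typedefs are repeated from the header so the block compiles alone; and returning NULL with `*sizptr` set to 0 at end of input is an assumption, since the header does not say how a reader signals that it is done.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Repeated from the header above so this sketch is self-contained. */
typedef char byte_t;
typedef const byte_t* (*Json_Reader)(void* data, size_t *sizptr);

/* Hypothetical client state, passed to the reader as `data`. */
typedef struct {
    const byte_t *text;    /* whole input, owned by the client */
    size_t        length;  /* total bytes in `text` */
    size_t        offset;  /* bytes already handed out */
    size_t        chunk;   /* maximum bytes per call, must be > 0 */
} Chunk_Source;

/* A Json_Reader that returns the next chunk of the in-memory input. */
static const byte_t* chunk_reader(void* data, size_t *sizptr) {
    Chunk_Source *src = data;
    assert(src != NULL && sizptr != NULL && src->chunk > 0);
    size_t left = src->length - src->offset;
    if (left == 0) {
        *sizptr = 0;
        return NULL;  /* ASSUMPTION: NULL with size 0 means end of input */
    }
    size_t n = left < src->chunk ? left : src->chunk;
    const byte_t *p = src->text + src->offset;
    src->offset += n;
    *sizptr = n;
    return p;
}

/* Stand-in for the parser's input loop: call the reader until it is
 * exhausted, copying bytes into `out`. Returns the number of bytes copied. */
static size_t drain_reader(Json_Reader r, void* d, byte_t *out, size_t cap) {
    size_t total = 0;
    size_t n = 0;
    const byte_t *p;
    while ((p = r(d, &n)) != NULL && n > 0) {
        if (total + n > cap) {
            break;  /* out of room; stop rather than overflow */
        }
        memcpy(out + total, p, n);
        total += n;
    }
    return total;
}
```

Because the reader returns pointers into memory the client already owns, nothing is copied or allocated on the reader's side, which fits the goal of leaving buffer management to the client.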


In Lua, the parser will create a tree of tables:

local jsonpp = require "jsonpp"

local result = jsonpp.parse [[
        {"quote": "THIS IS JSON!"}
]]

print(result.quote)  -- => THIS IS JSON!

Or, if the programmer prefers:

local jsonpp = require "jsonpp"

-- `readerfcn` is a function that accepts an arbitrary argument
-- (e.g. 1) to represent state and returns a string and the next state
-- value on every invocation until the last, when it returns nil.

local input <const> = [[
        {"quote": "THIS IS JSON!"}
]]

-- Hands out `input` four characters at a time.
local function readerfcn(state)
    if state > string.len(input) then
        return nil, nil
    end
    local newstate = state + 4
    return string.sub(input, state, newstate-1), newstate
end

local parser = jsonpp.new_parser(readerfcn, 1)

for event, value in parser:iterator() do
    -- `value` is either a string, number, boolean, nil, or error message
    -- depending on the value of `event`
    if event == jsonpp.START_OBJECT then
        -- do something
    elseif event == jsonpp.END_OBJECT then
        -- do something else
        -- etc.
    end
end

Or maybe instead of the for loop …

parser:handle_events {
    start_object = function (parser)
        -- something useful
    end,
    end_object = function (parser)
        -- something useful
    end,
    -- and so on.
}

Or if really going old-school:

local jsonpp = require "jsonpp"

local parser = jsonpp.new_parser([[{"quote": "THIS IS JSON!"}]])

local event, value;

event = parser:event()  -- => jsonpp.START_STREAM

-- ... and so forth ...


In Python, the parser will create a tree of dictionaries and lists.

import jsonpp

result = jsonpp.parse("""
        { "quote": "THIS IS JSON!" }
""")

print(result)
# Should be `{'quote': 'THIS IS JSON!'}`

Most or all of the Lua alternatives will be available in their own Pythonic idiom.


In Ruby, the parser will create a tree of Arrays and Hashes.

require "jsonpp"

# Ruby module names must be constants; `JSONPP` is an assumed name.
result = JSONPP.parse('{"quote": "THIS IS JSON!"}')

puts result.inspect
# Should be `{"quote"=>"THIS IS JSON!"}`

Most or all of the Lua alternatives will be available in accordance with the Ruby Way.