kwutil.util_json module
JSON utilities for debugging serializability and, in some cases, attempting to ensure it.
- kwutil.util_json.debug_json_unserializable(data, msg='')
Raises an exception if the data is not serializable and prints information about it. This is a thin wrapper around find_json_unserializable().
- kwutil.util_json.ensure_json_serializable(dict_, normalize_containers=False, verbose=0, unhandled_policy='keep')
Attempt to convert common types (e.g. numpy arrays) into something JSON compliant.
Converts numpy arrays and tuples into lists. Attempts to decode bytes as UTF-8, but skips them if this is not possible.
- Parameters:
dict_ (List | Dict) – A data structure nearly compatible with json. (todo: rename arg)
normalize_containers (bool) – if True, normalizes dict containers to be standard python structures. Defaults to False.
unhandled_policy (str) – What to do if there isn’t a straightforward way to convert to a serializable structure. Can be “keep”, “error” or “stringify”. Defaults to “keep”.
- Returns:
normalized data structure that should be entirely json serializable.
- Return type:
Dict | List
Note
This was ported from kwcoco.util
Example
>>> from kwutil.util_json import *  # NOQA
>>> import pathlib
>>> assert ensure_json_serializable([]) == []
>>> assert ensure_json_serializable({}) == {}
>>> data = [pathlib.Path('.')]
>>> assert ensure_json_serializable(data) == ['.']
>>> assert ensure_json_serializable(data) != data
Example
>>> # by default non-serializable objects are kept as-is
>>> from kwutil.util_json import ensure_json_serializable
>>> data = [[], {}, object(), (1, 2)]
>>> ensure_json_serializable(data)
>>> ensure_json_serializable(data, unhandled_policy='stringify')
>>> #ensure_json_serializable(data, unhandled_policy='pickle')
>>> import pytest
>>> with pytest.raises(Exception):
>>>     ensure_json_serializable(data, unhandled_policy='error')
Example
>>> # xdoctest: +REQUIRES(module:numpy)
>>> from kwutil.util_json import *  # NOQA
>>> import numpy as np
>>> import ubelt as ub
>>> data = ub.ddict(lambda: int)
>>> data['foo'] = ub.ddict(lambda: int)
>>> data['bar'] = np.array([1, 2, 3])
>>> data['foo']['a'] = 1
>>> data['foo']['b'] = (1, np.array([1, 2, 3]), {3: np.int32(3), 4: np.float16(1.0)})
>>> dict_ = data
>>> print(ub.urepr(data, nl=-1))
>>> assert list(find_json_unserializable(data))
>>> result = ensure_json_serializable(data, normalize_containers=True)
>>> print(ub.urepr(result, nl=-1))
>>> assert not list(find_json_unserializable(result))
>>> assert type(result) is dict
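The conversion rules and the three unhandled_policy values described above can be illustrated with a standard-library-only sketch. The function name to_jsonable is hypothetical and this is not kwutil's implementation; it only mirrors the documented behavior (tuples become lists, paths become strings, bytes are decoded as UTF-8 when possible) and omits the numpy handling.

```python
import json
import pathlib


def to_jsonable(obj, unhandled_policy="keep"):
    """Hypothetical stand-in for the documented behavior (not kwutil code)."""
    if obj is None or isinstance(obj, (bool, int, float, str)):
        return obj  # already JSON compatible
    if isinstance(obj, (list, tuple)):
        # Tuples are normalized to lists
        return [to_jsonable(item, unhandled_policy) for item in obj]
    if isinstance(obj, dict):
        return {key: to_jsonable(val, unhandled_policy) for key, val in obj.items()}
    if isinstance(obj, pathlib.PurePath):
        return str(obj)
    if isinstance(obj, bytes):
        try:
            return obj.decode("utf8")
        except UnicodeDecodeError:
            return obj  # leave bytes alone when they are not valid UTF-8
    # No straightforward conversion exists: apply the policy
    if unhandled_policy == "stringify":
        return str(obj)
    if unhandled_policy == "error":
        raise TypeError(f"cannot convert {type(obj).__name__} to JSON")
    return obj  # "keep"


data = {"path": pathlib.PurePosixPath("a/b"), "pair": (1, 2), "raw": b"hi"}
clean = to_jsonable(data)
text = json.dumps(clean)  # succeeds because every value is now plain
```

With the default "keep" policy an unconvertible object passes through unchanged, so a later json.dumps call can still fail; "stringify" and "error" trade that silent pass-through for lossy output or an early exception.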
- kwutil.util_json.find_json_unserializable(data, quickcheck=False)
Recurse through a JSON data structure and find any component that causes a serialization error. Record the location of these errors in the data structure as we recurse through the call tree.
- Parameters:
data (object) – data that should be json serializable
quickcheck (bool) – if True, first check the entire data structure in one pass, assuming it is OK, before running the Python-based recursive logic.
- Returns:
list of “bad part” dictionaries containing items
'data' - the value that caused the serialization error
'loc' - a list of keys/indexes that can be used to look up the location of the unserializable value. If an entry in “loc” is itself a list, it indicates a rare case where a key in a dictionary is causing the serialization error.
- Return type:
List[Dict]
Note
This was ported from kwcoco.util
Example
>>> # xdoctest: +REQUIRES(module:numpy)
>>> from kwutil.util_json import *  # NOQA
>>> import numpy as np
>>> import ubelt as ub
>>> part = ub.ddict(lambda: int)
>>> part['foo'] = ub.ddict(lambda: int)
>>> part['bar'] = np.array([1, 2, 3])
>>> part['foo']['a'] = 1
>>> # Create a dictionary with two unserializable parts
>>> data = [1, 2, {'nest1': [2, part]}, {frozenset({'badkey'}): 3, 2: 4}]
>>> parts = list(find_json_unserializable(data))
>>> print('parts = {}'.format(ub.urepr(parts, nl=1)))
>>> # Check expected structure of bad parts
>>> assert len(parts) == 2
>>> part = parts[1]
>>> assert list(part['loc']) == [2, 'nest1', 1, 'bar']
>>> # We can use the "loc" to find the bad value
>>> for part in parts:
>>>     # "loc" is a list of directions containing which keys/indexes
>>>     # to traverse at each descent into the data structure.
>>>     directions = part['loc']
>>>     curr = data
>>>     special_flag = False
>>>     for key in directions:
>>>         if isinstance(key, list):
>>>             # special case for bad keys
>>>             special_flag = True
>>>             break
>>>         else:
>>>             # normal case for bad values
>>>             curr = curr[key]
>>>     if special_flag:
>>>         assert part['data'] in curr.keys()
>>>         assert part['data'] is key[1]
>>>     else:
>>>         assert part['data'] is curr
Example
>>> # xdoctest: +SKIP("TODO: circular ref detect algo is wrong, fix it")
>>> from kwutil.util_json import *  # NOQA
>>> import pytest
>>> # Test circular reference
>>> data = [[], {'a': []}]
>>> data[1]['a'].append(data)
>>> with pytest.raises(ValueError, match="Circular reference detected at.*1, 'a', 1*"):
...     parts = list(find_json_unserializable(data))
>>> # Should be ok here
>>> shared_data = {'shared': 1}
>>> data = [[shared_data], shared_data]
>>> parts = list(find_json_unserializable(data))
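The general idea of recording a “loc” while recursing can be sketched with the standard library alone. The helper find_bad_parts below is hypothetical and is not how kwutil implements the search; in particular, the "key" tag placed in the nested list for bad dictionary keys is an assumption made for this sketch, and circular references are not handled.

```python
import json


def find_bad_parts(data, loc=()):
    """Yield {'loc', 'data'} dicts for parts json.dumps would reject.

    Hypothetical illustration only; it would recurse forever on
    circular references.
    """
    if isinstance(data, dict):
        for key, value in data.items():
            if not (key is None or isinstance(key, (str, int, float, bool))):
                # Bad key: the final "loc" entry is itself a list whose
                # second item is the offending key (the tag is made up here).
                yield {"loc": list(loc) + [["key", key]], "data": key}
            yield from find_bad_parts(value, loc + (key,))
    elif isinstance(data, (list, tuple)):
        for index, value in enumerate(data):
            yield from find_bad_parts(value, loc + (index,))
    else:
        try:
            json.dumps(data)
        except TypeError:
            # Leaf value that the stdlib encoder cannot handle
            yield {"loc": list(loc), "data": data}


data = [1, {"a": {1, 2}}, {frozenset({"k"}): 3}]
parts = list(find_bad_parts(data))
```

Here the first part points at the set stored under index 1 and key "a", and the second ends with a nested list marking the frozenset dictionary key.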
- class kwutil.util_json.Json
Bases: object
Similar to kwutil.Yaml, the Json class provides a set of helpers to make working with JSON easier.
Example
>>> from kwutil.util_json import Json
>>> import ubelt as ub
>>> unserializable_data = {
>>>     'a': 'hello world',
>>>     'b': ub.udict({'a': 3}),
>>>     'c': ub.Path('a/path/object'),
>>> }
>>> data = Json.ensure_serializable(unserializable_data)
>>> text1 = Json.dumps(data, backend='stdlib')
>>> # Coerce is idempotent and resolves the input to nested Python
>>> # structures.
>>> resolved1 = Json.coerce(data)
>>> resolved2 = Json.coerce(text1)
>>> resolved3 = Json.coerce(resolved2)
>>> assert resolved1 == resolved2 == resolved3 == data
>>> # with stdlib
>>> data2 = Json.loads(text1)
>>> assert data2 == data
>>> # with ujson
>>> # xdoctest: +REQUIRES(module:ujson)
>>> data2 = Json.loads(text1, backend='ujson')
>>> assert data2 == data
- static dump(data, fp, backend='stdlib', **kwargs)
Write json data to a file with a chosen backend.
- Parameters:
data (dict | list | int | float | str) – json serializable data.
fp (PathLike | IO) – Where to write the data
backend (str) – stdlib, ujson, or orjson
**kwargs – additional arguments to pass to the specific backend.
- static dumps(data, backend='stdlib', **kwargs)
Convert json data to text with a chosen backend.
- Parameters:
data (dict | list | int | float | str) – json serializable data.
backend (str) – stdlib, ujson, or orjson
**kwargs – additional arguments to pass to the specific backend.
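One way such backend selection could work is sketched below. The function dumps_with_backend is hypothetical and is not kwutil's code; it assumes only that ujson and orjson each expose a module-level dumps function, and that orjson returns bytes rather than str. Keyword arguments are backend specific (for example, indent is accepted by the standard library encoder but not by orjson).

```python
import importlib
import json


def dumps_with_backend(data, backend="stdlib", **kwargs):
    """Hypothetical dispatcher: serialize ``data`` with the named backend."""
    if backend == "stdlib":
        return json.dumps(data, **kwargs)
    if backend in {"ujson", "orjson"}:
        # Imported lazily so the optional dependency is only needed on use;
        # raises ImportError if the package is not installed.
        module = importlib.import_module(backend)
        out = module.dumps(data, **kwargs)
        # orjson produces bytes, so normalize to text
        return out.decode("utf8") if isinstance(out, bytes) else out
    raise KeyError(f"unknown backend: {backend!r}")


text = dumps_with_backend({"a": [1, 2, 3]})
pretty = dumps_with_backend({"a": 1}, indent=2)
```

Importing the optional backends lazily keeps the standard-library path free of extra dependencies.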
- classmethod coerce(data, backend='stdlib', path_policy='existing_file_with_extension')
Example
>>> from kwutil.util_json import Json
>>> import ubelt as ub
>>> Json.coerce('[1, 2, 3]')
[1, 2, 3]
>>> fpath = ub.Path.appdir('kwutil/tests/util_json').ensuredir() / 'file.json'
>>> fpath.write_text(Json.dumps([4, 5, 6]))
>>> Json.coerce(fpath)
[4, 5, 6]
>>> Json.coerce(str(fpath))
[4, 5, 6]
>>> dict(Json.coerce('{"a": "b", "c": "d"}'))
{'a': 'b', 'c': 'd'}
>>> Json.coerce(None)
None
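The behavior shown in the example above (JSON text is parsed, a path or path-like string is read from disk, and already-resolved data passes through) can be approximated with the standard library. The function coerce_json is hypothetical, and its rule for deciding when a string is a file path (it has a suffix and names an existing file) is only a guess at what the default path_policy name suggests, not a description of kwutil's logic.

```python
import json
import os
import pathlib
import tempfile


def coerce_json(data):
    """Hypothetical sketch of coercing text, paths, or data to Python objects."""
    if isinstance(data, os.PathLike):
        return json.loads(pathlib.Path(data).read_text())
    if isinstance(data, str):
        candidate = pathlib.Path(data)
        try:
            # Assumed rule: treat the string as a path only if it has an
            # extension and points at an existing file.
            is_json_file = bool(candidate.suffix) and candidate.is_file()
        except (OSError, ValueError):
            is_json_file = False  # e.g. text too long to be a file name
        if is_json_file:
            return json.loads(candidate.read_text())
        return json.loads(data)
    return data  # already resolved (dict, list, None, ...)


with tempfile.TemporaryDirectory() as tmp:
    fpath = pathlib.Path(tmp) / "file.json"
    fpath.write_text(json.dumps([4, 5, 6]))
    from_path = coerce_json(fpath)
    from_str_path = coerce_json(str(fpath))
from_text = coerce_json("[1, 2, 3]")
```

Because resolved data is returned unchanged, applying the function to its own output gives the same result, which matches the idempotence noted in the class-level example.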
- classmethod find_unserializable(data, quickcheck=False)
Example
>>> import kwutil
>>> import ubelt as ub
>>> data = {
>>>     'a': 1,
>>>     'b': 2,
>>>     'c': ub.Path('/pathlib/object')
>>> }
>>> results = list(kwutil.Json.find_unserializable(data))
>>> print(f'results = {ub.urepr(results, nl=1)}')
results = [
    {'loc': ['c'], 'data': Path('/pathlib/object')},
]
- classmethod ensure_serializable(dict_, normalize_containers=False, verbose=0, unhandled_policy='keep')
Example
>>> import kwutil
>>> import pathlib
>>> import ubelt as ub
>>> data = {
>>>     'a': 1,
>>>     'b': 2,
>>>     'c': pathlib.Path('/pathlib/object')
>>> }
>>> results = kwutil.Json.ensure_serializable(data)
>>> print(f'results = {ub.urepr(results, nl=1)}')
results = {
    'a': 1,
    'b': 2,
    'c': '/pathlib/object',
}
- classmethod debug_unserializable(data, msg='')
Raises an exception if the data is not serializable and prints information about it. This is a thin wrapper around Json.find_unserializable().
Example
>>> import kwutil
>>> import ubelt as ub
>>> data = {
>>>     'a': 1,
>>>     'b': 2,
>>>     'c': ub.Path('/pathlib/object')
>>> }
>>> try:
>>>     kwutil.Json.debug_unserializable(data, 'obj had non-json data at: ')
>>> except Exception as ex:
>>>     print(f'Exception: {ex}')
Exception: obj had non-json data at: [
    {'loc': ['c'], 'data': Path('/pathlib/object')},
]