Python: high performance backend

# More Efficient Python Implementation

Current `flatdata-py` implementation is pure python. So far we have used it only for processing smaller datasets and for inspection/debugging. It was noticed that on large datasets it performs quite slowly. It would be useful to have an implementation with performance not _too_ far from C++ one. In order to achieve that, we could do following:

- Benchmark two implementations on the same data, to know the gap, monitor the benchmarks in CI. #9
- Optimize pure-python implementation.
- Introduce parallel processing in pure python implementation (or ease integration with a library that would do it for us, like `dask`).
- As an alternative approach, create flatdata-py-ext implementation which would build and use binary extensions to improve performance.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Python: high performance backend #8

More Efficient Python Implementation

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Python: high performance backend #8

Description

More Efficient Python Implementation

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions