Monday, May 15, 2023
Show HN: Capillaries: Distributed data processing with Go and Cassandra https://ift.tt/3ehMfZL
I started thinking about this approach after working on a large-scale project for a major financial company, where our group developed a distributed in-house data processing solution. On a regular basis, it ingested a few gigabytes of financial data and, within a tight SLA time limit, produced a lot of enriched, aggregated, and validated data for a number of customers. Sometimes the source data had errors, so operators with domain knowledge had to verify data validity at certain checkpoints, make corrections immediately, and manually re-run parts of the workflow. The solution involved complex web service orchestration and a custom database, and it was very demanding on infrastructure availability.

Capillaries is an open-source Go solution, built from scratch, that does just that: it ingests data and applies user-defined transforms (Go one-liner expressions, Python formulas, joins, aggregations, denormalization), using Cassandra for intermediate data storage and RabbitMQ for task scheduling.

End users just have to provide:
- source data in CSV files;
- a Capillaries script (a JSON file) that defines the workflow and the transforms;
- Python code that performs complex calculations (only if needed).

The whole data processing pipeline can be split into separate runs that can be started independently and re-run by the user if needed. The goal is to build a platform that tolerates database and processing node failures and lets users focus on data transform logic and data quality control.

The “Getting started” Docker-based demo calculates ARK funds performance using EOD holdings and transactions data acquired from public sources. There are also integration tests that use non-financial data, and a test deploy tool that uses the OpenStack API for provisioning in the cloud.

https://capillaries.io

May 16, 2023 at 03:13AM
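To make the idea of independently re-runnable runs concrete, here is a minimal Go sketch. It is not Capillaries code and does not use its script format or API: the `store` type, `loadRun`, and `aggregateRun` are hypothetical names, the in-memory map only stands in for intermediate storage (Cassandra in Capillaries), and the CSV values are made up. It shows one run loading CSV rows, a second run aggregating them, and an operator fixing a bad value and re-running only the second run.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
)

// store is a hypothetical in-memory stand-in for intermediate storage.
// Nothing here talks to a real database or message queue.
type store map[string][][]string

// loadRun parses CSV text and saves the rows under the "raw" table.
func loadRun(s store, src string) error {
	rows, err := csv.NewReader(strings.NewReader(src)).ReadAll()
	if err != nil {
		return err
	}
	s["raw"] = rows
	return nil
}

// aggregateRun sums the second column per key in the first column.
// It reads only from the store, so it can be re-run without reloading.
func aggregateRun(s store) (map[string]float64, error) {
	totals := make(map[string]float64)
	for _, row := range s["raw"] {
		if len(row) != 2 {
			return nil, fmt.Errorf("expected 2 fields, got %d", len(row))
		}
		v, err := strconv.ParseFloat(row[1], 64)
		if err != nil {
			return nil, fmt.Errorf("bad value %q for %s: %w", row[1], row[0], err)
		}
		totals[row[0]] += v
	}
	return totals, nil
}

func main() {
	s := store{}

	// Run 1: ingest source data. The second row has a typo (letter O, not zero).
	if err := loadRun(s, "ARKK,100\nARKK,5O\nARKW,20\n"); err != nil {
		panic(err)
	}

	// Run 2 fails at the checkpoint because of the bad value.
	if _, err := aggregateRun(s); err != nil {
		fmt.Println("checkpoint failed:", err)
	}

	// The operator corrects the stored row and re-runs only run 2.
	s["raw"][1][1] = "50"
	totals, err := aggregateRun(s)
	if err != nil {
		panic(err)
	}
	fmt.Println("ARKK:", totals["ARKK"], "ARKW:", totals["ARKW"])
}
```

The point of the design is that each run depends only on stored intermediate data, so a correction does not force the whole pipeline to start over.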