Monday, May 15, 2023
Show HN: Capillaries: Distributed data processing with Go and Cassandra https://ift.tt/3ehMfZL
I started thinking about this approach after working on a large-scale project for a major financial company, where our group developed a distributed in-house data processing solution. On a regular basis, it ingested a few gigabytes of financial data and, within a tight SLA time limit, produced a lot of enriched, aggregated, and validated data for a number of customers. Sometimes the source data had errors, so operators with domain knowledge had to verify data validity at certain checkpoints, make corrections immediately, and re-run parts of the workflow manually. The solution involved complex web service orchestration and a custom database, and it was very demanding on infrastructure availability.

Capillaries is a built-from-scratch, open-source Go solution that does just that: it ingests data and applies user-defined transforms (Go one-liner expressions, Python formulas, joins, aggregations, denormalization), using Cassandra for intermediate data storage and RabbitMQ for task scheduling.

End users just have to provide:
- source data in CSV files;
- a Capillaries script (JSON file) that defines the workflow and the transforms;
- Python code that performs complex calculations (only if needed).

The whole data processing pipeline can be split into separate runs that can be started independently and re-run by the user if needed. The goal is to build a platform that tolerates database and processing node failures and lets users focus on data transform logic and data quality control.

The "Getting started" Docker-based demo calculates ARK funds performance using EOD holdings and transactions data acquired from public sources. There are also integration tests that use non-financial data, and a test deploy tool that uses the OpenStack API for provisioning in the cloud.

https://capillaries.io
May 16, 2023 at 03:13AM