I have a very large collection of JSON files (tens of millions) on S3, and I need to run some kind of aggregation over them, performed continuously as the collection grows. The aggregation consists mainly of counters that depend on what's in each file.
What is the best tool or database (open to anything) to perform such a task?
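
To make the kind of aggregation concrete, here is a minimal sketch of what I mean, in plain Python with no S3 access. The field name `status`, the object keys, and the helper names `update_counts` and `aggregate_new` are all hypothetical, chosen only for illustration; the real files have a different structure.

```python
import json
from collections import Counter


def update_counts(counts, raw_json, field="status"):
    """Increment the counter for the value stored under `field` in one JSON document.

    `field` is a hypothetical attribute name used only for this illustration.
    """
    doc = json.loads(raw_json)
    counts[doc.get(field, "<missing>")] += 1
    return counts


def aggregate_new(counts, seen_keys, objects, field="status"):
    """Count only objects whose key has not been processed yet.

    `objects` is an iterable of (key, raw_json) pairs standing in for S3 objects.
    Skipping already-seen keys keeps the aggregation incremental as files arrive.
    """
    for key, raw_json in objects:
        if key in seen_keys:
            continue
        update_counts(counts, raw_json, field)
        seen_keys.add(key)
    return counts


counts = Counter()
seen = set()

# Two batches standing in for the collection growing over time (made-up data).
batch1 = [("a.json", '{"status": "ok"}'), ("b.json", '{"status": "error"}')]
batch2 = [("b.json", '{"status": "error"}'), ("c.json", '{"status": "ok"}')]

aggregate_new(counts, seen, batch1)
aggregate_new(counts, seen, batch2)  # b.json is skipped, only c.json is counted
print(dict(counts))  # {'ok': 2, 'error': 1}
```

This obviously does not scale to tens of millions of files on one machine, which is why I am asking about tools; it only shows the shape of the counters and the incremental update I need.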