# Tapirus

Data tap.

This application collects transactional data from logs and stores them on S3. It provides an API for data retrieval.

## Configuration

Refer to the doc file.

## Data Records

The files generated by Tapirus from events have a standardized structure, described here.

## API

Refer to the API doc.

## Playground

You can deserialize payloads in the playground:

http://{host:port}/dev/playground

## Requirements

You need to have:

- Python 3.4.2+
- a SQL database (MySQL, MariaDB, or SQLite)

Python-specific requirements are listed in the requirements.txt file.

## Deployment

### Production/Server

To run the application, you can use the build.sh and run.sh scripts, which build and run a Docker container for the application, respectively. A look at the sample supervisor configuration file shows which services are required to run the application.

The container runs uWSGI, a Redis server, RQ workers, a Luigi server, Nginx, and application-specific services. All of these dependencies are downloaded and set up in the Dockerfile, but you need to have your configuration files on the project's base path. You can copy the sample files from conf/ and modify them for your environment. You also need a running SQL database with a database created for the application; specify this database, along with connection details, in the datastore section of the config.ini file, as sketched below.
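A minimal sketch of what that section might look like; the key names here are assumptions, so check the sample file in conf/ for the actual ones:

```ini
; Hypothetical datastore section -- key names are illustrative,
; not necessarily Tapirus's actual configuration schema.
[datastore]
host=localhost
port=3306
user=tapirus
password=secret
database=tapirus
```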

### Development/Local

To run the application locally, you only need the Redis server, an RQ worker, and the Luigi server; before that, you can build a virtualenv for the app with the build-env.sh script. Currently, Tapirus doesn't require any volume mounting: downloaded files and record files are kept in the container only for the duration of processing and are erased immediately after.
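A rough sketch of the local setup, assuming the default RQ queue and stock commands (adjust queue names and paths to your environment):

```sh
./build-env.sh   # create the virtualenv for the app
redis-server &   # queue backend used by RQ
rq worker &      # consumes queued jobs; pass queue names if not using the default
luigid &         # Luigi central scheduler
```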

## Third Party Services

### AWS-S3

Tapirus stores processed data in AWS S3. To do that, you need to specify configuration details for AWS S3. Three fields are required in the s3 section of the config.ini file:

- bucket=myBucket: the name of the S3 bucket, e.g. predictry
- prefix=path/pattern: the path after the bucket, e.g. data/processed
- records=records: the name of the folder where record files are stored

This translates into s3://predictry/data/processed/records/. For data access, Tapirus expects a boto.cfg file with the proper credentials in the project's root directory for the Docker build process.
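Putting the example values above together, the s3 section and the boto.cfg credentials would look roughly like this (the placeholder credential values are, of course, assumptions):

```ini
; s3 section of config.ini, using the example values above
[s3]
bucket=predictry
prefix=data/processed
records=records

; boto.cfg, placed in the project's root directory
[Credentials]
aws_access_key_id = YOUR_ACCESS_KEY
aws_secret_access_key = YOUR_SECRET_KEY
```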

### Error Reporting

Tapirus can send errors generated from logs to a RESTful endpoint for processing. This endpoint is assumed to be a queue/topic capable of receiving at least several hundred messages within a minute or so; an asynchronous recipient is ideal. In any case, this process does not interrupt the processing of the logs themselves: it is handled by independent workers that consume the error messages from an internal queue managed by RQ.
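A minimal sketch of that pattern with RQ; the queue name, endpoint, and report_error helper are illustrative, not Tapirus's actual internals:

```python
import requests
from redis import Redis
from rq import Queue

# Assumed endpoint; in Tapirus this would come from configuration.
ERROR_ENDPOINT = "https://errors.example.com/messages"

def report_error(payload):
    # Executed later by an independent RQ worker, off the log-processing path.
    requests.post(ERROR_ENDPOINT, json=payload)

errors = Queue("errors", connection=Redis())

def on_log_error(payload):
    # Enqueue and return immediately; log processing is never blocked.
    errors.enqueue(report_error, payload)
```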
