stevec

Reputation: 52658

How to store Heroku logs for data science purposes?

We can see how to view Heroku logs, as well as how to write the last n lines to a text file.

Is there an established pattern for sensible, easy log storage (potentially with an ETL step) and analysis?

At least, this would involve:

  1. storing logs
  2. moving logs (e.g. via an ETL) to somewhere they can be analysed en masse (e.g. AWS S3 or GCP GCS)

Is there any established pattern to achieve this?
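
For concreteness, this is the sort of analysis-side access I'm imagining once the logs have landed in object storage. It's only a rough sketch assuming gzipped log files sitting under an S3 prefix; the bucket name and prefix are made up:

```python
import gzip

import boto3
import pandas as pd

# Assumed layout: gzipped Heroku log lines archived under one S3 prefix.
# Bucket and prefix are made up for illustration.
BUCKET = "my-heroku-log-archive"
PREFIX = "logs/2023/"

s3 = boto3.client("s3")
lines = []

# Walk every archived object under the prefix and collect the raw log lines.
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        body = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
        text = gzip.decompress(body).decode("utf-8")
        lines.extend(text.splitlines())

# One row per log line; parsing out fields (path, user id, etc.) would happen here.
df = pd.DataFrame({"raw": lines})
print(df.shape)
```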

Background

Why would anyone want logs en masse? In case it's relevant, a specific task I'm trying to achieve is to use Bayesian inference on web logs to answer questions like: "if a person clicked on A, B and C, then they're x% likely to click on D" (so as to better understand which other pages a user may be interested in, and therefore suggest more relevant pages to the user). This is all pretty straightforward in Python or R. But obviously one needs access to the logs (all the logs) before such data science can be carried out.
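
As a toy illustration of the kind of estimate I mean (the sessions and page names below are made up; it just counts sessions and applies a Beta(1, 1) prior):

```python
# Each session is the set of pages a user clicked (toy, made-up data).
sessions = [
    {"A", "B", "C", "D"},
    {"A", "B", "C"},
    {"A", "B", "C", "D"},
    {"A", "C"},
    {"B", "C", "D"},
]

antecedent = {"A", "B", "C"}
target = "D"

# Count sessions containing A, B and C, and how many of those also contain D.
n_abc = sum(1 for s in sessions if antecedent <= s)
n_abcd = sum(1 for s in sessions if antecedent <= s and target in s)

# Posterior mean of P(D | A, B, C) under a Beta(1, 1) prior (Laplace smoothing).
p_d_given_abc = (n_abcd + 1) / (n_abc + 2)
print(f"P(click {target} | clicked {sorted(antecedent)}) ~ {p_d_given_abc:.2f}")
```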

What I know so far


Upvotes: 0

Views: 148

Answers (1)

Nuclearman

Reputation: 5314

Really, the best solution is probably to set up the Heroku app to also pipe your logs into an S3 bucket or something like that, though you may want to configure it so it only sends the log data you are actually interested in. Even better if you can find something that does this for you.

It looks like Papertrail at least allows this. Here is the current documentation link: https://documentation.solarwinds.com/en/Success_Center/papertrail/Content/kb/how-it-works/automatic-s3-archive-export.htm?cshid=pt-how-it-works-automatic-s3-archive-export

Using an outside service might get rather costly, though, depending on the volume of logs you need to handle. Otherwise, you may just need to roll your own solution (or, better yet, look for gems that can help).
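
If you do roll your own, one minimal approach is a small HTTPS endpoint registered as a Heroku log drain (`heroku drains:add https://your-endpoint/logs -a your-app`) that writes each POSTed batch of log lines to S3. A rough sketch only: the endpoint path, bucket name and key scheme are made up, and a real version would want authentication and batching:

```python
import uuid
from datetime import datetime, timezone

import boto3
from flask import Flask, request

app = Flask(__name__)
s3 = boto3.client("s3")
BUCKET = "my-heroku-log-archive"  # made-up bucket name


@app.route("/logs", methods=["POST"])
def receive_drain():
    # Heroku HTTPS drains POST batches of syslog-framed log lines in the request body.
    body = request.get_data(as_text=True)

    # Write each batch to its own time-partitioned key so later analysis can
    # list and read a whole day or month at once.
    key = datetime.now(timezone.utc).strftime("logs/%Y/%m/%d/") + uuid.uuid4().hex
    s3.put_object(Bucket=BUCKET, Key=key, Body=body.encode("utf-8"))
    return "", 204


if __name__ == "__main__":
    app.run(port=8000)
```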

Upvotes: 1
