Richard Green

Reputation: 2062

Python analytics (non web based)

I have published a library that is used in-house. It is not web based; it unifies access to several different data sources behind a single interface.

I would like to gather usage statistics of this library - obviously with the proviso that users of the library don't mind these statistics being taken.

Now this is not a web framework or anything similar, just a bunch of classes and functions.

Obviously the analytics framework must be able to cope with the gathering back end being unavailable. In fact, use of the library should preferably not be affected in any way by data being sent.

Has anybody written anything like this before? Obviously I could knock up one myself, but when presented with questions like this, I always try to find a version of one done already (as they've probably done a better job than I could ever do).

Upvotes: 1

Views: 223

Answers (3)

Unapiedra

Reputation: 16248

If you are okay with sending data out-of-house, here are two solutions with Python support:

  1. Segment
  2. Google Analytics

Segment

Segment seems to be a service that aggregates analytics data and lets you send it on to multiple analytics endpoints.

The Python API for it is open source and easy to use.

  • One could change the server URL easily.
  • The library is more sophisticated than the Google Analytics example below. In particular, the Segment library puts events on a queue and consumes them in a separate thread, so a blocked or slow connection will not slow down the application.
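The queue-plus-worker-thread idea can be sketched with the standard library alone. This is not Segment's actual implementation, just a minimal illustration of the pattern; the class name BackgroundTracker and the send callable (whatever function performs the network call) are hypothetical.

```python
import queue
import threading


class BackgroundTracker:
    """Buffer events in a bounded queue and send them from a worker thread."""

    def __init__(self, send, max_size=1000):
        # `send` is a hypothetical callable(event, properties) doing the network I/O.
        self._send = send
        self._queue = queue.Queue(maxsize=max_size)
        # A daemon thread does not keep the interpreter alive at exit.
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def track(self, event, properties=None):
        """Enqueue an event without blocking; return False if it was dropped."""
        try:
            self._queue.put_nowait((event, properties or {}))
            return True
        except queue.Full:
            return False

    def _run(self):
        while True:
            event, properties = self._queue.get()
            try:
                self._send(event, properties)
            except Exception:
                # A failing back end must never affect the host application.
                pass
            finally:
                self._queue.task_done()

    def flush(self):
        """Block until every queued event has been handled (useful in tests)."""
        self._queue.join()
```

The caller only ever touches track(), which returns immediately. When the queue is full the event is dropped rather than making the library wait, which is the trade-off I would choose for usage statistics.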

Install:

$ pip install analytics-python

Code:

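import functools

import analytics

analytics.write_key = 'YOUR_WRITE_KEY'
USER_ID = 'some_random_value'

# Attach some traits to the user (all values here are placeholders).
analytics.identify(USER_ID, {
    'email': 'user@example.com',
    'name': 'John Smith',
    'friends': 30
})

def track_function_call(fname, *args, **kwargs):
    # Queue a 'Function Called' event; the library sends it in the background.
    analytics.track(USER_ID, 'Function Called', {
        'function_name': fname,
        'args': args, 'kwargs': kwargs
    })

def track_decorator(function):
    # functools.wraps keeps the wrapped function's name and docstring intact.
    @functools.wraps(function)
    def wrapper(*args, **kwargs):
        track_function_call(function.__name__, *args, **kwargs)
        return function(*args, **kwargs)
    return wrapper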

@track_decorator
def main():
    print("My app does nothing but track you.")

if __name__ == '__main__':
    main()

Google Analytics

Google Analytics provides an example in their tutorial for App Engine.

It is a crude requests.post call, which will block. But the Consumer class in Segment's API (consumer.py) could probably be adapted to POST directly to Google Analytics.

# From Google's example:
import requests

GA_TRACKING_ID = 'YOUR_TRACKING_ID'  # Placeholder: replace with your own ID.


def track_event(category, action, label=None, value=0):
    data = {
        'v': '1',  # API Version.
        'tid': GA_TRACKING_ID,  # Tracking ID / Property ID.
        # Anonymous Client Identifier. Ideally, this should be a UUID that
        # is associated with particular user, device, or browser instance.
        'cid': '555',
        't': 'event',  # Event hit type.
        'ec': category,  # Event category.
        'ea': action,  # Event action.
        'el': label,  # Event label.
        'ev': value,  # Event value, must be an integer
    }

    response = requests.post(
        'http://www.google-analytics.com/collect', data=data)

    # If the request fails, this will raise a RequestException. Depending
    # on your application's needs, this may be a non-error and can be caught
    # by the caller.
    response.raise_for_status()


if __name__ == '__main__':
    track_event(
        category='MyPythonLibrary',
        action='Main called')
    print("My app does nothing.")

Upvotes: 0

bwbrowning

Reputation: 6530

You can use many 'web' analytics platforms inside of desktop or mobile apps.

Mixpanel is a popular one that I have looked at, but you can use Google Analytics in this way as well. Basically, you just put method calls in your code that call out to the Mixpanel server whenever you want to log an event.

It will be easier to use one of these than to invent your own.

Upvotes: 1

bosnjak

Reputation: 8624

You could do local logging plus a scheduled statistics upload that sends the log to your server. Of course, the user will have to give their consent, but I guess this is common practice. For the logging you can use any logging facility, like Python's logging module. For uploading to your server you can use any networking library, like twisted.

These two combined give you an almost complete solution, you just have to do some glue logic.
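The glue logic could look roughly like the following sketch, which uses only the standard library instead of twisted. The logger name mylib.usage, the helper names, and the send callable (whatever function posts the events to your server) are all hypothetical.

```python
import json
import logging
import os


def make_usage_logger(path):
    """Return a logger that appends one JSON document per line to `path`."""
    logger = logging.getLogger("mylib.usage")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep usage events out of the application's own logs
    if not logger.handlers:
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(message)s"))
        logger.addHandler(handler)
    return logger


def record(logger, event, **properties):
    """Write a single usage event to the local log."""
    logger.info(json.dumps({"event": event, "properties": properties}))


def upload_usage_log(path, send):
    """Pass all buffered events to `send`; return how many were uploaded.

    If `send` raises, the file is left untouched so the next run can retry.
    Events logged between the read and the truncation would be lost; a real
    implementation would need to rotate the file instead.
    """
    if not os.path.exists(path):
        return 0
    with open(path, encoding="utf-8") as handle:
        events = [json.loads(line) for line in handle if line.strip()]
    if not events:
        return 0
    try:
        send(events)
    except Exception:
        return 0  # server unreachable: keep the log for the next attempt
    with open(path, "w", encoding="utf-8"):
        pass  # truncate the log after a successful upload
    return len(events)
```

upload_usage_log can then be called from whatever scheduler suits you, for example a cron job or a threading.Timer started when the library is imported.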

If you want to do it live, while the library is being used (though I am not sure why you would want that), you can still use twisted since it can do asynchronous transfers.

Upvotes: 1
