Hacker News

This is awesome, and I really appreciate the effort toward privacy/transparency. Along those lines, supporting do not track is great, but why use GA at all? Just implementation ease? Is this something you plan to move away from?
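
For context on what "supporting do not track" involves: a browser with the setting enabled sends a `DNT: 1` request header, and a site can skip its analytics snippet when it sees that value. Here is a minimal sketch in Python. The function name and the plain-dict representation of headers are assumptions for illustration, not MediaCrush's actual code.

```python
def tracking_allowed(headers):
    """Return False when the request carries a Do Not Track opt-out.

    `headers` is assumed to be a plain dict of request header names to values.
    """
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    # "1" means the user opted out of tracking; anything else is not an opt-out.
    return normalized.get("dnt") != "1"
```

A page template would then only emit the analytics script when `tracking_allowed(request_headers)` is true.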


Thanks!

Well, GA is quite convenient - we get pretty graphs, realtime analytics and so on. It's not something we have considered moving away from, since it's trivial to disable it entirely. And it's not significantly worse than any other tracking tool.


I'd say it's substantially different from hosting your own Piwik, OWA, or even something like snowplow - where you could elect to avoid IP storage.

That said, those all entail a lot of work and/or additional cost. You're also absolutely right that allowing users to disable it (and ads) is an amazing feature.
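
To make "avoid IP storage" concrete: one common approach is to truncate the address before it is ever written to disk, so only the network portion survives. The sketch below uses Python's standard `ipaddress` module. The function name and the prefix lengths (/24 for IPv4, /48 for IPv6) are my own assumptions, not the defaults of any particular analytics tool.

```python
import ipaddress


def anonymize_ip(address, v4_prefix=24, v6_prefix=48):
    """Zero out the host bits of an IP address before storing it.

    The prefix lengths are illustrative assumptions; pick whatever
    granularity your privacy policy calls for.
    """
    ip = ipaddress.ip_address(address)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False lets us build the network from a host address.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


print(anonymize_ip("203.0.113.77"))           # 203.0.113.0
print(anonymize_ip("2001:db8:1234:5678::1"))  # 2001:db8:1234::
```

With a self-hosted tool you control whether this step runs before storage, which is the difference being pointed at here.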


I've found Piwik to be unusable for large datasets.


How large?

Piwik is really ancillary to the discussion at hand, but I often see the claim that Piwik can't handle busy sites, and it's important to quantify the claim.

I've had success (and others report similar results) with 500,000+ hits per day. http://piwik.org/docs/optimize/ reports adequate support at higher levels. It's quite easy to set this up with EC2 + RDS, and using autoscaling gets you a very resilient solution that can easily handle those numbers. Also, in the case of mediacru.sh, many of the optimizations have little impact, since they target reporting on already-gathered analytics. With only two analytics viewers/users, this is not much of an issue.

If you're doing more than 1 million hits per day, then I think something like snowplow, a commercial solution, or a fully custom solution is appropriate anyway.
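
To put those daily figures in perspective, here is the back-of-the-envelope conversion to an average request rate. This is only a sketch: it assumes traffic is spread evenly over the day, which real sites never are, so peak rates will be noticeably higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_hits_per_second(hits_per_day):
    """Average request rate, assuming perfectly even traffic (an idealization)."""
    return hits_per_day / SECONDS_PER_DAY


print(round(average_hits_per_second(500_000), 1))    # 5.8
print(round(average_hits_per_second(1_000_000), 1))  # 11.6
```

So the thresholds discussed here are on the order of 6 to 12 tracking requests per second on average.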


You might want to consider self hosted analytics. I've heard a lot of talk about Piwik, though I've not done much with it myself.


In addition to what jdiez had to say about GA - we're trying to understand our audience a little better. MediaCrush uses tons of new web tech that won't work on outdated browsers, and GA helps us get an easy look at support for things like that. Also tells us what kind of media is most popular, and who's sending us traffic, which is just kind of nice to know.


I want to like you, but for me using GA nullifies every nice thing you say in https://mediacru.sh/serious

There is no excuse for not running your own: http://demo.piwik.org/

Also worth reading: http://manurevah.com/blah/en/blog/Like-this-if-You-are-Again...


I'd volunteer to set it up for you on a virtual machine for free.


Okay, I understand that using Google Analytics when we're so pro-privacy is a bit of a weird choice.

We've realised that self-hosting our analytics might be a better choice. I've created an issue[1] to discuss this matter. Ideally what we'd want is something as close to GA as possible - real time analytics being reasonably important.

Note: we were aware of the implications of using GA on the site, but since we offer the ability to disable it very easily, we didn't think it was a big deal. That's a mistake on our part, so let's discuss how to fix it.

[1] https://github.com/MediaCrush/MediaCrush/issues/117


LOL! Okay, we don't like Piwik because it's written in PHP, so fsck privacy, google can haz it. This is truly sad.



