111.22.333.444 = 3 requests
444.22.333.444 = 2 requests

{
  "user_signature": "e5c7a9b0c2d1e6f8a3b5c7d9e1f3a5b7c9d1e3f5a7b9c1d3e5f7a9b1c3d5e7f9",
  "page_request_signature": "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2",
  "hostname": "example.com",
  "pathname": "/blog/artikel-123",
  "referrer": "google.com",
  "screen_width": 1920,
  "screen_height": 1080,
  "time_on_page": null
}

One of our most important development principles is never to store your raw data (IP address and user agent) together with your browsing activity, because that would not be privacy-friendly enough. Instead, we create an anonymized "signature" that lets us recognize later visits.
Many analytics providers store your raw IP address and user agent directly alongside your browsing activity. We reject this practice and will never do it. We store raw IP addresses only for security purposes, and they never appear in our customer data exports or dashboards. We examine IP addresses more closely only when we are under a DDoS attack and our protection team needs to identify malicious actors.
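As an illustration of the kind of check a protection team might run during an attack, the following sketch counts requests per IP address from a list of access-log entries. The function names and the output format are assumptions made for this example; the placeholder addresses are the same made-up ones used elsewhere on this page.

```python
from collections import Counter


def count_requests_per_ip(log_ips: list[str]) -> Counter:
    """Count how often each IP address appears in the access log."""
    return Counter(log_ips)


def format_counts(counts: Counter) -> list[str]:
    """Render the counts as 'IP = N requests', busiest address first."""
    return [f"{ip} = {n} requests" for ip, n in counts.most_common()]


# Hypothetical log excerpt with placeholder addresses.
log = [
    "111.22.333.444",
    "444.22.333.444",
    "111.22.333.444",
    "444.22.333.444",
    "111.22.333.444",
]
for line in format_counts(count_requests_per_ip(log)):
    print(line)
```

Such counts live only in the security tooling and are kept separate from the analytics data.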
Here are the details about our hashes:
After that, we check whether these hashes already exist in our database. If they don't, the visitor is new and unique, and we add the hashes. If they do, the visitor is returning. All stored hashes are automatically deleted at midnight.
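The existence check can be sketched roughly as follows. This is a minimal illustration, not our production code: the use of SHA-256 (suggested only by the 64-character hex signatures shown on this page), the choice of input fields, the salt handling, and all function names are assumptions, and an in-memory set stands in for the database.

```python
import hashlib
import secrets

# Hypothetical salt; how the real salt is generated and rotated is not described here.
DAILY_SALT = secrets.token_bytes(32)

# In-memory stand-in for the database table holding today's signatures.
seen_signatures: set[str] = set()


def make_signature(salt: bytes, site_id: int, ip: str, user_agent: str) -> str:
    """Hash the raw inputs so that only the digest has to be stored."""
    payload = b"|".join([salt, str(site_id).encode(), ip.encode(), user_agent.encode()])
    return hashlib.sha256(payload).hexdigest()


def is_unique_visitor(signature: str) -> bool:
    """Return True for a new signature (and remember it), False for a known one."""
    if signature in seen_signatures:
        return False
    seen_signatures.add(signature)
    return True


def reset_at_midnight() -> None:
    """Drop all stored signatures, as a scheduled midnight job would."""
    seen_signatures.clear()
```

Calling `is_unique_visitor` twice with the same signature returns `True` and then `False`; after `reset_at_midnight()` the same signature counts as new again, which is why a visitor cannot be followed across days.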
The brilliant thing about this hash system? We (or anyone else) can do absolutely nothing with these hashes. We can't "decrypt" them to see personal details. They're only valuable for the duration of a database existence check. Beyond that, they're completely, wonderfully useless:
Even if someone managed to break into AWS, where our servers are hosted, and get their hands on all our hashes and salt keys, there would still be quadrillions of possible combinations for a brute-force attack. Nobody on this planet has the resources to crack that. We've checked the numbers: Your hacking budget would need to be many times the global gross domestic product, hundreds of trillions of dollars.
We round page view timestamps down to the full hour instead of recording them to the second. Why? Because this prevents our access logs (which contain per-second timestamps) from being matched against our database of browsing activity. We do this deliberately to protect your privacy.
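A minimal sketch of this hourly aggregation, assuming timestamps are truncated to the start of their hour (so a page view at 00:01:05 is stored under 00:00:00). The function names and the dictionary layout are illustrative assumptions, not our actual schema.

```python
from collections import Counter
from datetime import datetime


def truncate_to_hour(ts: datetime) -> datetime:
    """Drop minutes, seconds, and microseconds from a timestamp."""
    return ts.replace(minute=0, second=0, microsecond=0)


def aggregate_pageviews(pageviews: list[dict]) -> list[dict]:
    """Count page views per (site, hostname, pathname, hour) bucket."""
    counts: Counter = Counter()
    for view in pageviews:
        key = (
            view["site_id"],
            view["hostname"],
            view["pathname"],
            truncate_to_hour(view["timestamp"]),
        )
        counts[key] += 1
    return [
        {"site_id": s, "hostname": h, "pathname": p, "timestamp": t, "pageviews": n}
        for (s, h, p, t), n in counts.items()
    ]
```

Because only the hourly bucket is kept, two page views at 00:01:05 and 00:59:59 become indistinguishable in the stored data: both simply add one to the 00:00:00 counter.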
{
  "id": 232332323234234,
  "user_signature": "5f9b9f01f747722565af71b4e602dc6239f050616b2dfa00944db79b84804c32",
  "site_id": 1234,
  "hostname": "http://www.milliondollarhomepage.com",
  "pathname": "/blog",
  "is_new_visit": true,
  "is_new_session": true,
  "is_unique": true,
  "referrer_hostname": "https://bing.com",
  "referrer_pathname": "/about",
  "timestamp": "2021-01-01 00:01:05",
  "duration": 0
}

{
  "site_id": 1234,
  "hostname": "https://milliondollarhomepage.com",
  "pathname": "/blog",
  "pageviews": 1,
  "visits": 1,
  "timestamp": "2021-01-01 00:00:00"
}

In the second record, the timestamp is aggregated to the current hour.

Here's an overview of the privacy standards we focus on at bchic analytics, our privacy-friendly analytics service:
Do you have questions about data processing? Contact us anytime!