Keeping pace with big data: real-time searches
Machines now generate data at a rate and scale far beyond the volumes of human-generated data for which most databases were originally architected, observes Steve Marsh, CEO and co-founder of GeoSpock.
"We are generating much more data than we are generating searches, and current databases are starting to fail or lag behind in terms of real time accessibility", he says. "80% of all data has a geotag attached to it, but often, the geolocation is rapidly changing and in order to be meaningful, it should be continuously updated and searchable in real time".
"Big data is slow data unless it is managed correctly. A new generation of applications use time and place to deliver a customer service. By combining this dynamic data with historical information in real time, companies are in a position to predict demand, manage services geographically and optimise their resources", explains Marsh, mentioning telematics or mapping services tied to social applications as obvious examples.
The startup has developed a multi-dimensional database that supports simultaneous data reads and writes to increase throughput, and that addresses the ‘heat map’ problem of large amounts of transient, nearby data. It is specifically designed for the storage, search and retrieval of geospatial data in real time, no matter how big the data set gets or how often it changes.
"A single mobile tower can collect up to 100,000 geotags every second. If telecom companies could monitor every device in real time, they could offer marketing insights to, say, a supermarket, telling it which demographics entered the premises, where the customers came from and where they are heading to afterwards. Even as anonymised statistical data, this would be very valuable", Marsh gives as an example.
"Imagine tens of thousands of drones, all having to communicate their position and their sensor data and each having to search for the positions of the others to calculate their geospatial relation to each other for path finding", he mentions as another possible application.
"A differentiator is that we can not only search for real time streaming data, but we can also correlate this to historical data to identify anomalous behaviours".
For now, the company is focusing on 2D geospatial information as a first step in its product development, but it says the multidimensional design of its database could scale up to search efficiently through very complex multidimensional spaces.
"Voice identification would potentially require searches across 300 dimensions; facial identification could require searches across billions of records through data vectors of up to 4000 dimensions", continues Marsh, who conceived the idea while reading for his PhD in Computer Science at Cambridge University, UK. There he was developing a real-time, extreme-scale supercomputer for simulating human brain function.
The company is promoting Version 2.1.0 of its GeoSpock API, which lets customers create, update and search a large, frequently changing set of objects with geospatial locations (GPS coordinates), held on a database service. The ‘locatable’ objects can be retrieved within a latitude–longitude bounding box, or as the nearest objects to a specified geospatial location. The trackable objects can also be assigned types and partitioned into collections for more focused searches.
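To make the two search modes concrete, here is a minimal in-memory sketch in Python. It is not the GeoSpock API: the names `Locatable`, `LocatableStore`, `upsert`, `in_bounding_box` and `nearest` are hypothetical, and the store scans every object linearly, whereas the company's database relies on its own indexing to stay fast at scale. The sketch only illustrates what a bounding-box query and a nearest-object query return, with an optional type filter.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass
class Locatable:
    """Hypothetical record for a tracked object; not GeoSpock's schema."""
    obj_id: str
    lat: float
    lon: float
    obj_type: str = "generic"


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


class LocatableStore:
    """Illustrative store: a plain dict searched by brute force."""

    def __init__(self):
        self._objects = {}

    def upsert(self, obj):
        # Create the object, or overwrite its position if the id exists.
        self._objects[obj.obj_id] = obj

    def in_bounding_box(self, south, west, north, east, obj_type=None):
        # Simplification: boxes crossing the antimeridian are not handled.
        hits = [o for o in self._objects.values()
                if south <= o.lat <= north and west <= o.lon <= east
                and (obj_type is None or o.obj_type == obj_type)]
        return sorted(hits, key=lambda o: o.obj_id)

    def nearest(self, lat, lon, k=1, obj_type=None):
        # Rank all (optionally type-filtered) objects by distance.
        candidates = [o for o in self._objects.values()
                      if obj_type is None or o.obj_type == obj_type]
        candidates.sort(key=lambda o: haversine_km(lat, lon, o.lat, o.lon))
        return candidates[:k]


store = LocatableStore()
store.upsert(Locatable("cambridge", 52.2053, 0.1218, "user"))
store.upsert(Locatable("london", 51.5074, -0.1278, "user"))
store.upsert(Locatable("paris", 48.8566, 2.3522, "vehicle"))

in_box = store.in_bounding_box(south=51.0, west=-1.0, north=53.0, east=1.0)
closest = store.nearest(52.0, 0.0, k=1)
closest_user = store.nearest(48.0, 2.0, k=1, obj_type="user")
```

A linear scan like this costs time proportional to the number of stored objects for every query, which is exactly the behaviour a purpose-built spatial index is meant to avoid when positions number in the millions and change constantly.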
On its website, GeoSpock claims its database architecture can support billions of locations and receive thousands of concurrent updates every second. Its simulation of a large-scale social app, tracking 100 million user locations across the world, is said to yield a fifteen-fold speed increase compared to a traditional NoSQL database, shrinking search response times from minutes to seconds.
Visit GeoSpock at www.geospock.com