The technology and processes of e-health were designed primarily for communication: enabling the fast and easy transmission of information from patients to the healthcare providers who can treat them. A side effect is that, over time, these systems generate large databases of measurements, medical imagery, symptom descriptions, diagnoses, etc. This happens because in most cases the data is not meant to be processed in real time but stored to be examined later.
While it is not their primary purpose, the constitution of those databases creates formidable opportunities for researchers to explore health data from hundreds or thousands of patients. Those numbers, much greater than what can be achieved in most regular studies that recruit volunteers, enable scientists to improve patient treatment by looking for patterns on a large scale and drawing new conclusions, such as the link between the development of a medical condition and environmental factors. For instance, the Centerstone Research Institute has developed tools for analyzing the treatment of all its patients and discerning the methods that give the best results in order to apply them to future patients.
The drawback is that scientists do not fully control the environment in which patients are treated. They cannot always access all the metrics they want to analyze and have to use only what is already there. This introduces a greater risk of error and misinterpretation than in controlled studies and in turn forces researchers to work with even larger samples in order to distinguish between true correlation and mere coincidence. The emergence of consumer connected health products is a great asset in this case because often it allows scientists to work with samples in the tens of thousands or more.
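To make the point about sample size concrete, here is a minimal sketch of how the smallest correlation that can be told apart from pure chance shrinks as the sample grows. It uses the standard Fisher z-transformation and assumes independent, roughly normally distributed measurements and a two-sided 5% significance level; the function name `critical_r` is made up for this illustration. It says nothing about confounding factors, which remain the bigger concern with uncontrolled data.

```python
import math

def critical_r(n, z=1.96):
    """Approximate smallest |r| that differs from zero at the 5% level.

    Based on the Fisher z-transformation, whose standard error is
    1 / sqrt(n - 3) for n independent observations (an assumption here).
    """
    return math.tanh(z / math.sqrt(n - 3))

# Compare a typical volunteer study with a connected-device dataset.
for n in (100, 1_000, 10_000):
    print(n, round(critical_r(n), 3))
```

With a hundred patients, only fairly strong correlations stand out from noise; with tens of thousands, much weaker ones become detectable.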
Some products are designed with future data mining in mind and build upon the data they generate to improve the service they provide. One of the best examples is probably Asthmapolis, a GPS device that attaches to the inhalers of asthma patients and records their location every time they use it. This not only allows the patients’ physician to see how often they used their inhalers (important information for making sure that the patients are treating their asthma correctly), but also enables the creation of maps showing where asthma attacks are most common so that patients can try to avoid those areas. Scientists can also use the data to deduce what local factors may be involved.
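As a rough illustration of how location records like these could be turned into a map of hot spots, the sketch below counts inhaler uses per grid cell. This is not how Asthmapolis actually works; the function name `hotspot_counts`, the cell size, and the coordinates are all invented for the example.

```python
import math
from collections import Counter

def hotspot_counts(events, cell_size=0.01):
    """Count inhaler-use events per grid cell.

    events: iterable of (latitude, longitude) pairs in decimal degrees.
    cell_size: cell width in degrees (0.01 is about 1 km north-south).
    """
    counts = Counter()
    for lat, lon in events:
        # Snap each point to the grid cell that contains it.
        cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        counts[cell] += 1
    return counts

# Made-up coordinates, for illustration only.
events = [(43.0731, -89.4012), (43.0735, -89.4018), (43.0912, -89.3501)]
busiest_cell, uses = hotspot_counts(events).most_common(1)[0]
```

Cells with the highest counts are candidates for areas to flag on a map; a real system would also need to account for how many patients pass through each area.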
The lack of a controlled environment may prevent researchers from obtaining definitive results immediately, but this kind of data remains very useful for finding new leads and ideas that can then be confirmed (or proven false) through a more traditional, rigorous process.
The whole process came full circle in 2010 when the US Department of Health and Human Services invited app developers to explore its vast stores of publicly available health data to design new tools and features for connected and mobile health… which in turn will likely generate more data for scientists to sift through.
What about you? What other uses of data mining for medical research can you envision? Would you be okay with having your personal data used for those types of projects, even if anonymized?