Abstract
Background
The World Wide Web has emerged as a powerful data source for
epidemiological studies related to infectious disease surveillance.
However, its potential for cancer-related epidemiological discoveries is
largely unexplored.
Methods
Using advanced web crawling and tailored information extraction procedures,
the authors automatically collected and analyzed the text content of
79 394 online obituary articles published between 1998 and 2014. The
collected data included 51 911 cancer (27 330 breast; 9470 lung; 6496
pancreatic; 6342 ovarian; 2273 colon) and 27 483 non-cancer cases. With
the derived information, the authors replicated a case-control study
design to investigate the association between parity (i.e.,
childbearing) and cancer risk. Age-adjusted odds ratios (ORs) with 95%
confidence intervals (CIs) were calculated for each cancer type and
compared to those reported in large-scale epidemiological studies.
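As a rough illustration of the statistic named above, the following sketch computes a crude odds ratio with a Wald 95% confidence interval from a 2x2 table, and a Mantel-Haenszel pooled odds ratio across age strata as one common way to adjust for age. The abstract does not say which adjustment method the authors used, so the Mantel-Haenszel choice, the function names, and the counts are all assumptions made for illustration and are not data from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Wald confidence interval.

    a: exposed cases (e.g. parous women with cancer)
    b: unexposed cases (nulliparous women with cancer)
    c: exposed controls (parous women without cancer)
    d: unexposed controls (nulliparous women without cancer)
    z: normal quantile; 1.96 gives an approximate 95% interval
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio over (a, b, c, d) tables,
    one table per age stratum."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the study.
    or_, lo, hi = odds_ratio_ci(300, 100, 350, 90)
    print(f"crude OR = {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")

    # Hypothetical age strata (e.g. <60, 60-74, 75+), again not study data.
    age_strata = [(80, 30, 90, 25), (120, 40, 140, 35), (100, 30, 120, 30)]
    print(f"age-adjusted (MH) OR = {mantel_haenszel_or(age_strata):.2f}")
```

An odds ratio below 1 with a confidence interval that excludes 1 corresponds to the "significantly reduced risk" wording used in the Results.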
Results
Parity was found to be associated with a significantly reduced risk of
breast cancer (OR = 0.78, 95% CI, 0.75-0.82), pancreatic cancer
(OR = 0.78, 95% CI, 0.72-0.83), colon cancer (OR = 0.67, 95% CI,
0.60-0.74), and ovarian cancer (OR = 0.58, 95% CI, 0.54-0.62). A marginal
association was found for lung cancer risk (OR = 0.87, 95% CI,
0.81-0.92). The linear trend between increased parity and reduced cancer
risk was dramatically more pronounced for breast and ovarian cancer
than for the other cancers included in the analysis.
Conclusion
This large web-mining study on parity and cancer risk produced findings
very similar to those reported with traditional observational studies.
Web mining of this kind may be a promising strategy to generate hypotheses for
guiding and prioritizing future epidemiological studies.