Natural Language Processing

Datasets, research, analyses, and conferences about NLP.



See also: Sentiment and Opinion Analysis

🔗 https://coronavirustechhandbook.com/sentiment

See also: Misinformation

🔗 https://coronavirustechhandbook.com/misinformation

See also: Science

🔗 https://coronavirustechhandbook.com/science

Introduction

Vast quantities of text and speech data may contain valuable insights. Thousands of published research articles on coronaviruses, with more appearing daily, may shape our understanding of the latest virus (SARS-CoV-2) or support best-practice clinical management of the disease. Analysis of millions of social media posts can help us understand how the public at large is responding to the outbreak. Identifying misinformation as it spreads can be critical to public health messaging. Automatic identification and organization of helpful information collected from the web can aid the public response.

Research areas include:

  • Text mining of scientific literature related to COVID-19 (e.g. CORD-19 dataset)
  • Analysis of text from the web, social media, or clinical data in support of public health activities related to COVID-19. Text analysis topics include sentiment, mental health, and well-being related to COVID-19.
  • Analysis of the collateral effects of COVID-19 using text. Collateral effects include anything that is happening as a result of the virus, including economic effects.
  • Multi-lingual or cross-lingual analysis of COVID-19 related textual data
  • Semantic search of COVID-19 related textual data
  • Chatbots and other interactive support systems related to COVID-19
  • Analysis of spoken language related to COVID-19
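
To make the search item in the list above concrete, here is a minimal sketch of ranking documents against a query. It uses plain TF-IDF with cosine similarity, which is a lexical baseline rather than true semantic search (that would typically replace these vectors with embeddings from a pretrained model). The function names and the sample documents are illustrative assumptions, not part of any dataset listed on this page.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens, keeping internal hyphens."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def rank(query, documents):
    """Return (index, score) pairs sorted by TF-IDF cosine similarity, best first."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency, so no weight is ever zero.
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1.0
           for term, count in df.items()}

    def vectorize(tokens):
        # Terms never seen in the collection are ignored.
        counts = Counter(tokens)
        return {t: counts[t] * idf[t] for t in counts if t in idf}

    def cosine(a, b):
        dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query_vec = vectorize(tokenize(query))
    scores = [(i, cosine(query_vec, vectorize(tokens)))
              for i, tokens in enumerate(tokenized)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample documents, for illustration only.
docs = [
    "Clinical management of SARS-CoV-2 infection",
    "Public sentiment on social media during the outbreak",
    "Economic effects of lockdown measures",
]
best_index, best_score = rank("social media sentiment", docs)[0]
print(docs[best_index])
```

A baseline like this is mainly useful as a point of comparison: documents that share no words with the query score zero, which is exactly the gap that embedding-based search tries to close.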


NLP = Natural Language Processing

Datasets & Models

COVID-19 Open Research Dataset

CORD-19

🔗 https://cset.georgetown.edu/covid-19-open-research-dataset-cord-19/
🔗 https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge
CORD-19 is a resource of over 57,000 scholarly articles, including over 45,000 with full text, about COVID-19, SARS-CoV-2, and related coronaviruses. This freely available dataset is provided to the global research community to apply recent advances in natural language processing and other AI techniques to generate new insights in support of the ongoing fight against this infectious disease. The collection will be updated as new research is published in peer-reviewed publications and archival services like bioRxiv, medRxiv, and others.
CORD-19 dataset, hosted by Kaggle.
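
As a starting point for text mining a collection like this, the sketch below filters article metadata by keyword using only the Python standard library. It assumes the metadata is available as a CSV file with `cord_uid`, `title`, and `abstract` columns; those column names, the `find_articles` helper, and the sample rows are assumptions for illustration, so check the layout of the release you download.

```python
import csv
import io

# Tiny in-memory stand-in for a metadata file (hypothetical rows).
SAMPLE = """cord_uid,title,abstract
a1,Spike protein structure,We describe the SARS-CoV-2 spike glycoprotein.
b2,Influenza surveillance,Seasonal influenza trends in 2018.
c3,Coronavirus transmission,Household transmission of SARS-CoV-2.
"""


def find_articles(csv_file, keyword, fields=("title", "abstract")):
    """Yield rows whose chosen fields contain the keyword (case-insensitive)."""
    needle = keyword.lower()
    for row in csv.DictReader(csv_file):
        # Missing or empty fields are treated as empty strings.
        haystack = " ".join((row.get(field) or "") for field in fields).lower()
        if needle in haystack:
            yield row


hits = list(find_articles(io.StringIO(SAMPLE), "sars-cov-2"))
print([row["cord_uid"] for row in hits])  # prints ['a1', 'c3']
```

To run this against a real file, pass an open file handle instead of the `io.StringIO` wrapper, for example `open(path, newline="", encoding="utf-8")`.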

Elsevier Coronavirus Dataset

🔗 https://www.elsevier.com/connect/coronavirus-initiatives
Linguamatics has processed these documents to create a new index using its ScienceDirect indexing settings. More information about the dataset, including how long it will be available, can be found in the Elsevier Coronavirus Center.

Dimensions

🔗 https://www.dimensions.ai/news/dimensions-is-facilitating-access-to-covid-19-research/
Free access to COVID-19 publications, datasets and clinical trials

COVID-19 corpus

🔗 https://www.sketchengine.eu/covid19/
SketchEngine has tokenized, POS-tagged, and lemmatized the text.

CORD-19 (COVID-19 Open Research Dataset)

🔗 http://pubannotation.org/collections/CORD-19
The PubAnnotation team is collecting annotations.

Building a Pandemic Retrieval Test Collection

🔗 https://dmice.ohsu.edu/hersh/COVIDSearch.html
OHSU is soliciting queries for retrieval topics.

LitCovid

🔗 https://www.ncbi.nlm.nih.gov/research/coronavirus/
A curated literature hub for tracking up-to-date scientific information about the 2019 novel Coronavirus.

COVID-19 Twitter data sets

🔗 http://www.panacealab.org/covid19/
🔗 https://github.com/echen102/COVID-19-TweetIDs


COVID-19 Data Resources

🔗 http://covid19dataresources.org
This site serves as a platform for collecting data resources and publications in the fight against COVID-19. These resources are focused on social media data and how it can be used to prevent the spread of COVID-19. Possible applications include combating misinformation, supporting messaging from public health organizations, and tracking information about the ongoing COVID-19 pandemic.

Pretrained NLP Models

SciBERT

🔗 https://github.com/allenai/scibert
SciBERT is a BERT model for scientific text.

Presentations of NLP

See also: Sentiment Analysis

🔗 https://coronavirustechhandbook.com/

Dashboards

COVID-19 Primer

🔗 https://covid19primer.com/dashboard
Quickly understand the scientific progress in the fight against COVID-19: read NLP-generated summaries and discover trends in the latest research papers and the conversations around them. Updated every 24 hours.

COVID-19 Global Dashboard

🔗 https://covid19.linguamatics.com
Uses Linguamatics NLP to extract COVID-19 relevant abstracts and trial locations from MEDLINE® and ClinicalTrials.gov.

Articles

Artificial Intelligence against COVID-19

🔗 https://towardsdatascience.com/artificial-intelligence-against-covid-19-an-early-review-92a8360edaba
AI has not yet made an impact, but data scientists have taken up the challenge

Tools

See also: Remote Working

🔗 https://coronavirustechhandbook.com/remote

Figshare

🔗 https://figshare.com/
Publish your conference outputs freely and openly

Overleaf

🔗 https://www.overleaf.com/
Collaborate on academic writing

Writefull

🔗 https://writefull.com/researchers.html
Real-time help with academic writing

Altmetric

🔗 https://www.altmetric.com/resources-trending-research
Within the Altmetric Explorer you can find online attention data for millions of research articles, clinical trials, datasets and more.

Conferences

NLP COVID-19 Workshop @ACL2020

🔗 https://www.nlpcovid19workshop.org
Paper submission deadline June 30, 2020

NLP Competitions

The Kaggle CORD-19 challenge

🔗 https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge
Kaggle is issuing a call to action to the world's artificial intelligence experts to develop text and data mining tools that can help the medical community develop answers to high-priority scientific questions.