Supercapacitor Materials Database Generated using Web Scrapping and Natural Language Processing

J Mol Graph Model. 2025 May:136:108980. doi: 10.1016/j.jmgm.2025.108980. Epub 2025 Feb 13.

Abstract

Electrochemical energy storage plays a vital role in achieving environmental sustainability. Supercapacitors have emerged as promising alternatives to batteries because of their high power density and extended lifespan. An extensive body of research on supercapacitor energy storage offers valuable insights into materials and performance parameters. This study presents a comprehensive supercapacitor materials database, built by web scraping article abstracts from the Scopus database and processing them with regular expressions, the BatteryBERT language model, and the ChemDataExtractor Python package. The final database comprises 28,269 entries across 21 fields, covering metadata, electrode and electrolyte materials, and seven key device performance parameters. The database is intended to support the prediction and design of advanced supercapacitors.
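The regular-expression stage of such a pipeline can be illustrated with a minimal sketch. The patterns, field names, and sample sentence below are hypothetical and chosen only for illustration; the abstract does not list the expressions the authors used or name the seven performance parameters, and the BatteryBERT and ChemDataExtractor stages are not shown.

```python
import re

# Hypothetical patterns for illustration only; these are assumptions,
# not the expressions used to build the published database.
NUMBER = r"(\d[\d,]*(?:\.\d+)?)"
PATTERNS = {
    # e.g. "312.5 F/g" or "312.5 F g-1"
    "specific_capacitance_F_per_g": re.compile(
        NUMBER + r"\s*F\s*(?:/\s*g|g\s*[-−]1)", re.IGNORECASE),
    # e.g. "45.2 Wh/kg" or "45.2 Wh kg-1"
    "energy_density_Wh_per_kg": re.compile(
        NUMBER + r"\s*Wh\s*(?:/\s*kg|kg\s*[-−]1)", re.IGNORECASE),
    # e.g. "92% capacitance retention" or "92 % retention"
    "capacitance_retention_percent": re.compile(
        NUMBER + r"\s*%\s*(?:capacitance\s+)?retention", re.IGNORECASE),
    # e.g. "5,000 cycles"
    "cycle_count": re.compile(NUMBER + r"\s*cycles", re.IGNORECASE),
}


def extract_parameters(abstract):
    """Return the first numeric match per field, or None if the field is absent."""
    record = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(abstract)
        if match:
            # Strip thousands separators before converting to a number.
            record[field] = float(match.group(1).replace(",", ""))
        else:
            record[field] = None
    return record


if __name__ == "__main__":
    sample = (
        "The NiCo2O4 electrode delivers 312.5 F/g at 1 A/g, with an energy "
        "density of 45.2 Wh kg-1 and 92% capacitance retention after 5,000 cycles."
    )
    for field, value in extract_parameters(sample).items():
        print(field, value)
```

A sketch like this only captures values written in the unit spellings it anticipates; variants such as mF/cm2 or kW/kg would need additional patterns, which is one reason a pipeline might pair regular expressions with a language model and a chemistry-aware parser.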

Keywords: BatteryBERT model; Database; Natural language processing; Supercapacitor; Web scraping.

MeSH terms

  • Databases, Factual*
  • Electric Capacitance*
  • Electric Power Supplies
  • Electrodes
  • Internet*
  • Natural Language Processing*