+98 21 8609-3065 h.veisi@ut.ac.ir
Text Mining

Most of the valuable data around us is in an unstructured format. Discovering useful knowledge from text, which is one kind of unstructured data, is an important task. Text mining (or text analytics) refers to the process of extracting information from text using machine learning algorithms. Research on text mining in DSP Lab covers text categorization, text clustering, concept/entity extraction, sentiment analysis, document summarization, and document similarity, with a focus on the Persian language.

 

Text mining, also known as text data mining, is the process of transforming unstructured text into a structured format to identify meaningful patterns and new insights. By applying analytical techniques such as Naïve Bayes, Support Vector Machines (SVM), and deep learning algorithms, companies can explore and discover hidden relationships within their unstructured data.
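
As a rough illustration of the idea above, the following sketch turns raw text into word counts (a structured form) and then applies a small multinomial Naïve Bayes classifier with add-one smoothing. It is a minimal example under stated assumptions, not the lab's implementation: the names tokenize, train, predict and TOY_DOCS are hypothetical, the four training sentences are invented, and the tokenizer handles English letters only, so Persian text would need a different tokenization step.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (English letters only)."""
    return re.findall(r"[a-z']+", text.lower())


def train(labeled_docs):
    """Build per-class word counts from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    class_counts = Counter()            # label -> number of documents
    for text, label in labeled_docs:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocab


def predict(text, model):
    """Return the label with the highest log-probability under Naive Bayes."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Log prior: fraction of training documents with this label.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Invented toy data, for illustration only.
TOY_DOCS = [
    ("great product, works well", "positive"),
    ("really great quality, love it", "positive"),
    ("terrible product, broke quickly", "negative"),
    ("poor quality, really terrible", "negative"),
]

model = train(TOY_DOCS)
print(predict("great quality", model))
```

With real data, the same two steps (counting words, then scoring classes) are usually delegated to a library, and the training set would be far larger than four sentences.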

Text is one of the most common data types within databases. Depending on the database, this data can be organized as:

  • Structured data: This data is standardized into a tabular format with numerous rows and columns, making it easier to store and process for analysis and machine learning algorithms. Structured data can include inputs such as names, addresses, and phone numbers.
  • Unstructured data: This data does not have a predefined data format. It can include text from sources such as social media or product reviews, or rich media formats such as video and audio files.
  • Semi-structured data: As the name suggests, this data is a blend between structured and unstructured data formats. While it has some organization, it doesn’t have enough structure to meet the requirements of a relational database. Examples of semi-structured data include XML, JSON and HTML files.
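
The three categories above can be contrasted with a small sketch that stores the same facts in each form. This is only an illustration: the sample record (name, city, phone number) is invented, and the variable names are hypothetical. It uses only the Python standard library.

```python
import csv
import io
import json
import re

# Structured: a fixed table, every row has the same columns.
structured = "name,city,phone\nSara,Tehran,555-0100\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: keys and nesting give some organization,
# but the shape can vary from record to record.
semi_structured = (
    '{"name": "Sara", "contact": {"city": "Tehran", "phones": ["555-0100"]}}'
)
record = json.loads(semi_structured)

# Unstructured: free text; fields must be extracted, here with a
# simple pattern that only matches this invented phone format.
unstructured = "Sara lives in Tehran; you can reach her at 555-0100."
phones = re.findall(r"\d{3}-\d{4}", unstructured)

print(rows[0]["city"])            # direct column lookup
print(record["contact"]["city"])  # lookup through nested keys
print(phones)                     # values recovered by pattern matching
```

The structured and semi-structured lookups address a field by name, while the unstructured case has to infer where the field is. That inference step is the part text mining techniques aim to automate.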

Other Projects

Speech Recognition

Automatic Speech Recognition (ASR) refers to techniques that convert spoken language into text. Our ASR work in DSP Lab concentrates on Persian speech...

Medical Image Processing

Imaging has become an essential component in many fields of medical and laboratory research and clinical practice. Biologists study cells and...

Digital Image Processing

Image Processing and Computer Vision are fields that include methods for acquiring information from a digital image and understanding it, then...

