AI and Language: Basic topics to learn, use for NLP and train - Brain

This article presents essential information for language acquisition for both humans and computer software, particularly artificial intelligence (AI) and natural language processing (NLP) implementations. It defines key topics in a relevant order, outlines required information, and suggests actionable steps for self-learning and model training.

The primary focus is on leveraging data and patterns, and on identifying the most effective methods for learning a new language or for developing and training a model for AI or NLP purposes.

This resource can be applied to any language, and it can be improved continually by reviewing implementation results and gathering feedback.

Key principles at the beginner level:

  • The structure of language in a learning process:
    • words
    • phrases
    • sentences
    • paragraphs
    • compositions
  • High-frequency words.
  • Words in context.
  • Intensive and extensive exposure to the language.
  • Apply AI and NLP principles, and use the available tools.
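
The structure listed above (compositions made of paragraphs, sentences and words) can be sketched in code. This is a minimal illustration using only Python's standard library. The function names and the sample text are made up for the example, and the sentence splitter is deliberately naive (it will mis-split abbreviations such as "Dr.").

```python
import re

def split_paragraphs(text):
    """Split a composition into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def split_sentences(paragraph):
    """Naive split: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]

def split_words(sentence):
    """Lowercased runs of letters and apostrophes."""
    return re.findall(r"[a-zA-Z']+", sentence.lower())

# Made-up sample composition with two paragraphs.
sample = "I like tea. Do you like tea?\n\nShe reads every day."

# Nested lists: composition -> paragraphs -> sentences -> words.
structure = [
    [split_words(s) for s in split_sentences(p)]
    for p in split_paragraphs(sample)
]
print(structure)
```

A real project would likely swap these regular expressions for a proper tokenizer, but the nesting of levels stays the same.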

Specific actions and base information:

  • Phonetic alphabet: the IPA (it looks hard but is easy).
    • Vowel sounds (15) and consonant sounds.
    • The music principle: "You can't hear what you don't know".
  • The 200 most common words, with the right pronunciation.
  • See the Oxford lists for an overview and an outline by topic:
    • 2000 and 5000 word lists.
    • Phrase lists (Oxford and OPAL):
        • Spoken.
        • Written.
  • Sentence structure.
    • A good grammar resource.
  • Text patterns (the right way to good speech and writing).
  • Listen (shadowing: listen and repeat) and write (read and listen in order to write; transcribe).
    • Intensive reading of the language material helps comprehension a lot; stay focused on it.
    • Extensive reading, listening and transcribing throughout the whole process, at every step, to achieve an advanced level.
    • Engage in conversations (either in person or online).
  • AI use: software that checks your writing and spelling.
  • Go straight to the advanced topics.

You can't learn to write well only by writing.

You can improve your speaking skills by speaking (fixing errors and picking up new strategies).

Learn words in context; avoid learning isolated words.
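
One way to study a word in context is a small "keyword in context" listing, which shows every occurrence of a word together with its neighbours. The sketch below is only an illustration. The function name, the window size and the sample sentences are assumptions made for this example.

```python
import re

def keyword_in_context(text, keyword, window=3):
    """Return (left, keyword, right) for each occurrence of keyword,
    with up to `window` words of context on each side."""
    words = re.findall(r"[\w']+", text.lower())
    target = keyword.lower()
    hits = []
    for i, word in enumerate(words):
        if word == target:
            left = " ".join(words[max(0, i - window):i])
            right = " ".join(words[i + 1:i + 1 + window])
            hits.append((left, word, right))
    return hits

# Made-up sample: the same word used as a verb and as a noun.
sample = "She will run a company. I run every morning. The run was long."
for left, word, right in keyword_in_context(sample, "run", window=2):
    print(f"{left:>15} [{word}] {right}")
```

Note that punctuation is dropped, so the context window can cross sentence boundaries.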


Reading Techniques:

  1. Skimming (get a basic idea of the text, e.g. a magazine or newspaper)
  2. Scanning (quickly scan for specific information)
  3. In-depth reading (after skimming)
    1. Intensive (time-consuming, high focus, best way to learn; intensive practice before an exam)
    2. Extensive (involves enjoyment)

For a Test:

  1. Grammar and Vocabulary (rules, tenses, parts of speech, sentence structure and words in context).
  2. Reading Comprehension (public articles).
  3. Listening Skills (podcasts).
  4. Speaking Practice.
  5. Writing Skills (write essays).
  6. Practice Tests.

Additional Resource:

 -------------------------------------------------------------------------------------------------------------------------

Patterns - Training a Model or Training Yourself

AI Patterns applied to life.

Statistical information about words is relevant.

  • Dimension: how many words does the language have?
    • Alphabet (symbols and sounds)
    • Words
  • Prioritization/Optimization:
    • Word list by frequency
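
To make the prioritization idea concrete, the sketch below builds a word list ordered by frequency and measures how much of a text the most frequent words cover. It is a minimal illustration with the standard library. The helper names and the toy sentence are assumptions for the example, and a real list would be computed from a large corpus.

```python
from collections import Counter
import re

def frequency_list(text):
    """Return (word, count) pairs, most frequent first."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common()

def coverage(freq, top_n):
    """Share of all word occurrences covered by the top_n most frequent words."""
    total = sum(count for _, count in freq)
    covered = sum(count for _, count in freq[:top_n])
    return covered / total if total else 0.0

# Toy text; replace with a real corpus for meaningful numbers.
sample = "the cat sat on the mat and the dog sat on the rug"
freq = frequency_list(sample)
print(freq[:3])
print(f"Top 3 words cover {coverage(freq, 3):.0%} of the text")
```

The coverage figure is the reason high-frequency lists are a useful starting point: a small number of words accounts for a large share of what a learner or a model will see.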

Dataset sources for implementation:

Data structure [PENDING - in writing process]

Implementation actions and tasks:

  •  Clustering and aggregation (group and select words for the knowledge <area> nearest to you)
  •  Basic and extended patterns (get the basic idea, then reinforce it with the corpus):
    • Sentence patterns
    • Paragraph and text
    • Speech patterns
  •  Corpus and data (reinforcement)
    • Text (great writers)
    • Video and audio (podcasts, audiobooks, music)
    • Dialogue and public speech
  •  Detect errors (track their frequency, with checklist rules to fix and mitigate them)
    • Immersion and testing (simultaneous and continuous)
  •  Places and real people (heavily used).
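
The error-detection task in the list above can be sketched as a simple tally: log each error by category, count the categories, and pair the most frequent ones with a checklist rule. Everything in this example is hypothetical (the categories, the rules and the log); it only shows the shape of the idea.

```python
from collections import Counter

# Hypothetical checklist: error category -> rule to review.
CHECKLIST = {
    "article": "Review a/an/the before singular countable nouns.",
    "tense": "Check that the verb tense matches the time expression.",
    "spelling": "Add the word to a personal spelling list.",
}

def rank_errors(error_log):
    """Count logged error categories, most frequent first,
    and attach the matching checklist rule to each."""
    counts = Counter(error_log)
    return [
        (category, n, CHECKLIST.get(category, "No rule yet; add one."))
        for category, n in counts.most_common()
    ]

# Made-up log of errors noticed while correcting one's own writing.
log = ["tense", "article", "tense", "spelling", "tense", "article"]
for category, count, rule in rank_errors(log):
    print(f"{count}x {category}: {rule}")
```

Working from the top of the ranking keeps the effort on the errors that occur most often.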

Train:

  • Language corpus text: as much text as possible (write, transcribe and read)
  • Recorded audio and videos: process the sounds.
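
As a very small illustration of training on corpus text, the sketch below counts which word follows which (a bigram model) and uses the counts to guess the next word. This is a toy stand-in, not how modern language models are trained. The function names and the three-sentence corpus are assumptions for the example.

```python
from collections import Counter, defaultdict
import re

def train_bigrams(corpus):
    """Count, for every word, which words follow it in the corpus.
    Punctuation is ignored, so pairs may cross sentence boundaries."""
    words = re.findall(r"[a-z']+", corpus.lower())
    model = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        model[current][following] += 1
    return model

def most_likely_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = model.get(word)
    return followers.most_common(1)[0][0] if followers else None

# Toy corpus; more text gives more reliable counts.
corpus = "I like tea. I like coffee. I drink tea every day."
model = train_bigrams(corpus)
print(most_likely_next(model, "i"))
```

The same principle applies to a human learner: the more text is read and transcribed, the better the sense of which words tend to go together.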

Learning and Brain Process

Curiosity (retaining it): "A Scientist is a kid but not physically or mentally". - Neil deGrasse Tyson

Brain and Creativity

  • Can we learn forever?
  • Creativity
  • Convergence and Divergence
    • My hypothesis: As we form concrete and closed ideas, we gradually lose the ability to observe and discover new things, blocking the brain's learning ability in the process.
    • As we age, we become more resilient. The brain removes connections that we don't use (weak connections).
  • Synapse: As we learn new things, we create new connections (synapses) or strengthen weak connections.

  • Innovation: implies a result that is a positive change (mixing the old with the new, or two different worlds or perspectives) or an improvement (a new creation, application or discovery).

Relevant information:

  • The Economist - Your brain from birth to death
  • Brain synapses and Alzheimer's.





Database:

https://github.com/chrplr/openlexicon/blob/master/datasets-info/README.md

 

In Python:

  import pandas as pd
  # Load the Lexique383 table (tab-separated) directly from its URL
  lex = pd.read_csv('http://www.lexique.org/databases/Lexique383/Lexique383.tsv', sep='\t')
  # Preview the first rows
  lex.head()

In R:

  library(readr)
  # Load the Lexique383 table (tab-separated) directly from its URL
  lex <- read_tsv('http://www.lexique.org/databases/Lexique383/Lexique383.tsv')
  # Preview the first rows
  head(lex)

 

French Lexicon Database

https://github.com/chrplr/openlexicon

 

 

 

Language Models

https://ai.meta.com/blog/5-steps-to-getting-started-with-llama-2/ 
