TONG-H

ai

8625 Notes (js), 2025-06-01 to 2025-07-03

memory bank

maintain context across sessions

https://forum.cursor.com/t/how-to-add-cline-memory-bank-feature-to-your-cursor/67868

tips

  • each chat should be dedicated to a single feature, to prevent interference from irrelevant context

reference

https://github.com/zhangchenchen/self-consistent-coder/blob/main/cursor-large-project-tips.md

MCP-ModelContextProtocol

  • allows AI models to access and use external resources (such as databases, APIs, and local files) without a custom integration for each data source.

  • it’s particularly beneficial in scenarios where AI models need to interact with multiple data sources or tools.

  • concepts

    • MCP host, an application that integrates AI models, like Cursor
    • MCP client, functions as a plugin within the host, bridging the host and the server
    • MCP server, an external server that offers data and capabilities
    • Tools allow models to take actions through your server
    • Resources provide data to models, not yet supported in Cursor
      • a static resource is equivalent to an uploaded file, except its content can be dynamic
    • Prompts create a message template, or a message workflow
  • Claude Desktop failed to run my MCP servers, but the same servers work great with Cursor

  • the official documents and some articles may lag behind SDK updates, as many fields and params have changed

  • debugger

    • npx @modelcontextprotocol/inspector
    • with claude: Open DevTools: Command-Option-Shift-i
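To run a server with Cursor, it is registered in `.cursor/mcp.json` (project-level) or `~/.cursor/mcp.json` (global). A minimal sketch; the server name and package below are placeholders, not real ones:

```json
{
  "mcpServers": {
    "example-server": {
      "command": "npx",
      "args": ["-y", "@example/mcp-server"],
      "env": { "EXAMPLE_API_KEY": "<key>" }
    }
  }
}
```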
  • The tokenization process splits text into smaller units called tokens, usually with a sub-word technique like Byte Pair Encoding (BPE) or WordPiece. Non-Latin text is treated differently: beyond the obvious characters, additional splits and encodings for sub-word components, punctuation, or special tokens may exist and vary with the tokenization strategy
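A rough way to see why non-Latin text tends to cost more tokens: byte-level BPE tokenizers start from UTF-8 bytes, and a CJK character is three bytes where a Latin letter is one, so the tokenizer has more base units to merge. Actual token counts depend on the learned vocabulary; this sketch only counts bytes:

```javascript
// Count UTF-8 bytes: byte-level BPE tokenizers start from these bytes,
// so more bytes means more base units before merges are applied.
const utf8Bytes = (text) => new TextEncoder().encode(text).length;

console.log(utf8Bytes("hello")); // 5 (one byte per Latin letter)
console.log(utf8Bytes("你好"));   // 6 (three bytes per CJK character)
```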

browsertools

  • useful for inspecting dom and adjusting style

    • Q: get the selected element. its child element has padding that is used to make a gap between two elements. the gap is needed when the two elements are aligned.

  • get network logs and console logs

    • Q: getnetwork. log the requests with pageSize:20 set

    • this feature does not work as expected; it tends to miss logs. the repo still has many unresolved issues related to log retrieval
    • sometimes, wiping logs or closing other F12 panels can help

RAG-RetrievalAugmentedGeneration

  • combines traditional information retrieval with generative models.
  • allows the model to retrieve relevant information from an external knowledge base (like a document store or search engine) before generating an answer.
    When a query (input) is received:
    • before-retrieval
      • routing
      • rewriting
      • expansion
    • retrieval
    • after-retrieval
      • rerank
      • summary
      • fusion
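The stages above can be sketched with a toy keyword retriever; no model calls are made, and both the expansion rule and the corpus are made up for illustration:

```javascript
const docs = [
  "Use text splitters to split long documents into smaller chunks.",
  "Vector stores index embeddings for similarity search.",
  "Ollama runs large language models locally.",
];

// before-retrieval: expand the query with a hand-written synonym rule
// (a real system would use an LLM rewrite or a thesaurus here).
const expand = (q) => q + (q.includes("split") ? " chunk" : "");

// retrieval: score each document by overlapping terms.
const score = (q, doc) => {
  const terms = new Set(q.toLowerCase().split(/\W+/).filter(Boolean));
  return doc.toLowerCase().split(/\W+/).filter((w) => terms.has(w)).length;
};

// after-retrieval: rerank by score and keep the best hit.
const retrieve = (query) =>
  docs
    .map((d) => ({ doc: d, s: score(expand(query), d) }))
    .sort((a, b) => b.s - a.s)[0].doc;

console.log(retrieve("how to split documents"));
// → "Use text splitters to split long documents into smaller chunks."
```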

Use LangChain and Ollama to go through the RAG workflow

https://js.langchain.com/docs/tutorials/rag

  • Indexing, a pipeline for ingesting data

    • load, load document via document_loaders
    • split, split the doc into chunks via text splitters that support four strategies. long documents are hard to fit into the context window of many models, and models can struggle to find information in very long inputs
      • length
      • text-structure, based on paragraphs, sentences, and words
      • document-structure, based on an inherent structure, e.g. HTML, Markdown, or JSON
      • semantic-meaning
    • embed, Wrapper around a text embedding model for converting text to embeddings
    • store, store the splits in a VectorStore, which allows adding text and Document objects and querying them with various similarity metrics.
      • can be in-memory or via a third party
      • can connect to an existing vector store
      • similaritySearch, similaritySearchWithScore
    • asRetriever, generates a Retriever, specifically a VectorStoreRetriever
      • Retrievers are Runnables and implement a standard set of methods (invoke and batch operations)
      • similaritySearch vs Retriever
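The indexing pipeline can be approximated without LangChain: a fixed-length splitter, a toy letter-frequency "embedding" standing in for a real embedding model, and an in-memory store with a cosine-based similaritySearch. Class and function names here are illustrative, not LangChain APIs:

```javascript
// split: fixed-length chunks with overlap, the simplest "length" strategy.
function splitText(text, chunkSize = 40, overlap = 10) {
  const chunks = [];
  for (let i = 0; i < text.length; i += chunkSize - overlap) {
    chunks.push(text.slice(i, i + chunkSize));
  }
  return chunks;
}

// embed: a toy stand-in for a real embedding model; maps text to a
// 26-dim letter-frequency vector so cosine similarity is computable.
function embed(text) {
  const v = new Array(26).fill(0);
  for (const ch of text.toLowerCase()) {
    const k = ch.charCodeAt(0) - 97;
    if (k >= 0 && k < 26) v[k] += 1;
  }
  return v;
}

const cosine = (a, b) => {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b) || 1);
};

// store + similaritySearch: an in-memory vector store.
class MemoryVectorStore {
  constructor() { this.items = []; }
  addTexts(texts) {
    for (const t of texts) this.items.push({ text: t, vector: embed(t) });
  }
  similaritySearch(query, k = 1) {
    return this.items
      .map((it) => ({ text: it.text, score: cosine(embed(query), it.vector) }))
      .sort((a, b) => b.score - a.score)
      .slice(0, k)
      .map((r) => r.text);
  }
}

const store = new MemoryVectorStore();
store.addTexts(splitText("Retrieval augmented generation combines search with LLMs."));
console.log(store.similaritySearch("retrieval search", 1));
```

A real pipeline only swaps the toy pieces: a document loader in front of splitText, an embedding model behind embed, and a persistent store behind the class.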
  • Retrieval, takes a user query at run time and retrieves the relevant data from the store via retrievers

    • Query analysis
      • can Re-write or expand to improve semantic or lexical searches
      • can translate natural language queries into specialized query languages or filters, like sql, cypher
  • generation, passes the relevant data and question to the model

    • allows loading a prompt template from the prompt hub
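A prompt template is essentially string interpolation over named slots; below is a minimal stand-in for what a hub template provides (LangChain's real PromptTemplate adds validation and chaining on top of this):

```javascript
// A minimal prompt template: fills {name} slots from a values object;
// unknown slots are left intact.
const formatPrompt = (template, values) =>
  template.replace(/\{(\w+)\}/g, (_, key) => values[key] ?? `{${key}}`);

const ragTemplate =
  "Answer the question using only the context below.\n" +
  "Context: {context}\n" +
  "Question: {question}";

console.log(formatPrompt(ragTemplate, {
  context: "MCP lets models call external tools.",
  question: "What does MCP enable?",
}));
```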

task-master

  • built upon Claude AI (required) and Perplexity AI (optional)
  • even without Claude AI, tasks can still be managed manually: creating, editing, and tracking them in tasks.json or via the Task Master CLI/MCP tools.
    • Perplexity AI
      • a search engine powered by models like Claude and GPT, plus its own fine-tuned models
      • fact-based, best for research (cited answers)
      • Always up-to-date
      • Less conversational
    • Claude
      • safer and more polite.
        • unlike OpenAI, which relies on human feedback to fix bad behavior, Anthropic uses Constitutional AI, which bakes in a set of human principles.
        • sometimes too verbose or cautious
      • best for long-document analysis; can handle over 100k tokens and is optimized for huge inputs
  • Init
    • use parse_prd to analyze the PRD document and create tasks
    • a foundational task structure will be created and used for later tracking
  • update
    • analyze what changed in the PRD and update / add / cancel tasks
    • code changes still need manual implementation
  • tips for better maintenance
    • use analyze-complexity, backed by Perplexity AI
      • to learn each task's complexity level
      • it's better to break down complex tasks with expand
    • Periodically validate and fix invalid or circular dependencies.
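For reference, tasks.json holds a flat list of task records. The exact field set varies by Task Master version, so treat the shape below as an assumption rather than the definitive schema:

```json
{
  "tasks": [
    {
      "id": 1,
      "title": "Set up project scaffolding",
      "status": "done",
      "dependencies": [],
      "priority": "high",
      "subtasks": []
    },
    {
      "id": 2,
      "title": "Implement auth flow",
      "status": "pending",
      "dependencies": [1],
      "priority": "medium",
      "subtasks": []
    }
  ]
}
```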