Attributes of the DataAnalyzr Class

Basic Attributes - Configuration of the instance. Includes analysis_type, params, generator_llm, analysis_llm, context, and logger.
Data Related Attributes - Input datasets and the database and vector store connections. Includes df_dict, database_connector, and vector_store.
Analysis Related Attributes - Values generated during analysis. Includes analysis_code, analysis_guide, analysis_output, and plot_code.
Output Attributes - Values returned as responses. Includes plot_output, insights_output, recommendations_output, tasks_output, and ai_queries_output.

Basic Attributes

analysis_type

Literal['sql', 'ml', 'skip']

The type of analysis to be performed.
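
As a minimal sketch of how the three allowed values could be checked before use, the snippet below restates the documented Literal type. The names AnalysisType and check_analysis_type are hypothetical and are not part of the DataAnalyzr class.

```python
from typing import Literal, get_args

# Mirrors the documented type of analysis_type.
AnalysisType = Literal["sql", "ml", "skip"]


def check_analysis_type(value: str) -> str:
    """Return value unchanged if it is a documented analysis type (hypothetical helper)."""
    allowed = get_args(AnalysisType)
    if value not in allowed:
        raise ValueError(f"analysis_type must be one of {allowed}, got {value!r}")
    return value


print(check_analysis_type("sql"))
```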

params

ParamsDict

Dictionary of class parameters.

Keys of params:

max_retries

integer

Maximum number of retries for the LLM calls and analysis. Default is 10.

time_limit

integer

Time limit in seconds for the LLM calls and analysis. Default is 45 for analysis and 60 for visualisation.

auto_train

boolean

Whether to automatically add questions with their SQL query or Python code to the vector store. Default is True.
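
The three keys above could be written out as a plain dictionary. The sketch below is illustrative only: ParamsSketch is a hypothetical stand-in for ParamsDict, the values restate the documented defaults, and how the dictionary is passed to DataAnalyzr is not shown here.

```python
from typing import TypedDict


class ParamsSketch(TypedDict, total=False):
    """Hypothetical stand-in for ParamsDict; only the documented keys are listed."""
    max_retries: int
    time_limit: int
    auto_train: bool


# Values restate the documented defaults; 45 is the time limit for analysis.
params: ParamsSketch = {
    "max_retries": 10,
    "time_limit": 45,
    "auto_train": True,
}

print(params)
```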

generator_llm

LiteLLM

LLM instance for generating analysis. Defaults to GPT-4o.

For details on configuring the LLM, see the Large Language Models guide.

Attributes of generator_llm:

model

string

Name of the model to use.

api_key

string

API key for accessing LLM services. May also be set as an environment variable.

analysis_llm

LiteLLM

LLM instance for performing analysis. Defaults to GPT-4o.

For details on configuring the LLM, see the Large Language Models guide.

Attributes of analysis_llm:

model

string

Name of the model to use.

api_key

string

API key for accessing LLM services. May also be set as an environment variable.
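
For either LLM, the fallback from an explicit api_key to an environment variable might look like the sketch below. The variable name OPENAI_API_KEY is an assumption that fits an OpenAI-hosted model such as the default, other providers use other names, and resolve_api_key is a hypothetical helper.

```python
import os

# Assumption: the key for an OpenAI-hosted model lives in OPENAI_API_KEY.
# Other providers use other variable names.
ENV_VAR = "OPENAI_API_KEY"


def resolve_api_key(explicit_key=None):
    """Prefer an explicitly passed key, else fall back to the environment (hypothetical helper)."""
    return explicit_key or os.environ.get(ENV_VAR)


key = resolve_api_key()
print("API key found" if key else f"{ENV_VAR} is not set")
```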

context

ContextDict

Context dictionary for the analysis.

Keys of context:

analysis

string

Context for the analysis.

visualisation

string

Context for the visualisation generation.

insights

string

Context for the insights generation.

recommendations

string

Context for the recommendations generation.

tasks

string

Context for the tasks generation.
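
A context dictionary with the five keys above could be assembled as follows. The values are made-up examples and the key check is a hypothetical convenience, not something the class is documented to do.

```python
# Keys match the documented fields of context; the values are made-up examples.
context = {
    "analysis": "Sales figures are in USD and cover 2023 only.",
    "visualisation": "Prefer bar charts with one colour per region.",
    "insights": "Focus on month-over-month changes.",
    "recommendations": "The audience is the regional sales managers.",
    "tasks": "Each task should be doable within one week.",
}

DOCUMENTED_KEYS = {"analysis", "visualisation", "insights", "recommendations", "tasks"}

# Catch misspelt keys (for example "visualization") before the dictionary is used.
unknown = set(context) - DOCUMENTED_KEYS
if unknown:
    raise KeyError(f"Unexpected context keys: {sorted(unknown)}")
print(sorted(context))
```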

logger

logging.Logger

Logger object for logging messages.
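
Since the attribute is a standard logging.Logger, a suitable object can be built with the Python standard library alone. The sketch below assumes nothing about how the logger is handed to DataAnalyzr, and the logger name is made up.

```python
import logging

# The logger name "dataanalyzr_demo" is made up for this example.
logger = logging.getLogger("dataanalyzr_demo")
logger.setLevel(logging.INFO)

# Add a console handler once, so repeated runs do not duplicate messages.
if not logger.handlers:
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)

logger.info("Logger ready")
```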

Data Related Attributes

df_dict

dictionary

Dictionary of dataframes loaded from files or databases.

df_dict = {
    "table_name": pandas.DataFrame,
}

database_connector

DatabaseConnector

Database connector object for connecting to databases.

Attributes of database_connector:

host

string

Hostname of the database server. Applicable for PostgreSQL and Redshift databases.

port

string

Port number of the database server. Applicable for PostgreSQL and Redshift databases.

user

string

Username for the database connection. Applicable for PostgreSQL and Redshift databases.

database

string

Name of the database to connect to. Applicable for PostgreSQL and Redshift databases.

password

string

Password for the database connection. Applicable for PostgreSQL and Redshift databases.

schema

list

Schema names to load. Applicable for PostgreSQL and Redshift databases.

tables

list

Table names to load. Applicable for PostgreSQL and Redshift databases.

conn

psycopg2.connect or redshift_connector.connect or sqlite3.connect

Connection object for the database.
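
To make the conn attribute concrete, the sketch below opens a SQLite connection with the standard library and lists its tables. It only illustrates what such a connection object is; the in-memory database and the orders table are made up, and the way DataAnalyzr itself creates the connection is not shown here.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.execute("INSERT INTO orders (amount) VALUES (19.99)")
conn.commit()

# List user tables, the kind of names a tables list would contain.
rows = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
).fetchall()
tables = [name for (name,) in rows]
print(type(conn).__name__, tables)
```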

vector_store

ChromaDBVectorStore

Vector store object for storing questions and their SQL queries or Python code.

For details on configuring the vector store, see the Vector Store guide.

Attributes of vector_store:

path

string

Path to the vector store file.

chroma_client

chromadb.PersistentClient

ChromaDB client object for storing vectors.

documentation_collection

chromadb.Collection

Collection object for storing documentation.

ddl_collection

chromadb.Collection

Collection object for storing DDL queries.

sql_collection

chromadb.Collection

Collection object for storing question and SQL query pairs.

python_collection

chromadb.Collection

Collection object for storing question and Python code pairs.

plot_collection

chromadb.Collection

Collection object for storing question and plot code pairs.

Analysis Related Attributes

analysis_code

string

Code generated by the LLM for analysis.

analysis_guide

string

Guide used to generate the analysis code.

analysis_output

pandas.DataFrame or dictionary or string

Output generated by executing the analysis code.
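
Because analysis_output can take one of three types, code that consumes it usually needs to branch on the type. The following is a minimal sketch of one way to do that; describe_output is a hypothetical helper, and the DataFrame case is detected by duck typing so that the example runs without pandas installed.

```python
def describe_output(output):
    """Summarise an analysis result by type (hypothetical helper, not part of the class)."""
    if isinstance(output, str):
        return f"text: {output}"
    if isinstance(output, dict):
        return "dictionary with keys: " + ", ".join(str(k) for k in output)
    # Duck-typed check so the sketch runs without importing pandas.
    if hasattr(output, "columns") and hasattr(output, "shape"):
        rows, cols = output.shape
        return f"table with {rows} rows and {cols} columns"
    raise TypeError(f"Unsupported analysis output type: {type(output).__name__}")


print(describe_output("No anomalies found"))
print(describe_output({"mean": 4.2, "median": 4.0}))
```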

plot_code

string

Code generated by the LLM for generating visualisations.

Output Attributes

plot_output

string

Path to a PNG file containing the plot generated by executing the plot code.
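
Since plot_output is a file path rather than image data, it can be worth confirming that the file exists and really is a PNG before displaying or uploading it. The sketch below does this by comparing the first eight bytes with the PNG signature; is_png is a hypothetical helper and the file name is a placeholder.

```python
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def is_png(path):
    """Return True if path points to an existing file that starts with the PNG signature."""
    file = Path(path)
    if not file.is_file():
        return False
    with file.open("rb") as fh:
        return fh.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE


print(is_png("missing_plot.png"))
```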

insights_output

string

Insights generated by the LLM.

recommendations_output

string

Recommendations generated by the LLM.

tasks_output

string

Tasks generated by the LLM.

ai_queries_output

dictionary

AI queries generated by the LLM, grouped by type of analysis.

ai_queries_output = {
    "type_of_analysis1": ["query1", "query2", "query3", "query4"],
    "type_of_analysis2": ["query1", "query2", "query3", "query4"],
    "type_of_analysis3": ["query1", "query2", "query3", "query4"],
}
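
A dictionary of this shape can be flattened into (analysis type, query) pairs, for instance to present the queries as a list of follow-up questions. The sketch below uses placeholder data in the documented shape.

```python
# Placeholder data in the documented shape: analysis type -> list of queries.
ai_queries_output = {
    "type_of_analysis1": ["query1", "query2"],
    "type_of_analysis2": ["query1", "query2"],
}

# Flatten into (analysis type, query) pairs.
pairs = [
    (analysis, query)
    for analysis, queries in ai_queries_output.items()
    for query in queries
]

for analysis, query in pairs:
    print(f"{analysis}: {query}")
```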
