diff --git a/Getting_Started/Teradataml_Widgets/Analytic_Functions_Tutorial.ipynb b/Getting_Started/Teradataml_Widgets/Analytic_Functions_Tutorial.ipynb new file mode 100644 index 00000000..e571e7f4 --- /dev/null +++ b/Getting_Started/Teradataml_Widgets/Analytic_Functions_Tutorial.ipynb @@ -0,0 +1,577 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "a8a3245e", + "metadata": {}, + "source": [ + "
\n", + "

\n", + " Using the Analytic_Functions UI in Teradataml Widgets\n", + "
\n", + " \"Teradata\"\n", + "

\n", + "
" + ] + }, + { + "cell_type": "markdown", + "id": "185f9c94", + "metadata": {}, + "source": [ + "

Introduction

\n", + "\n", + "

Teradataml Widgets (teradatamlwidgets) extends teradataml’s built-in capabilities for interacting with the Teradata Vantage™ Data and Analytics Platform. It provides visual components for scaled, in-database analytics, run from within a notebook, on data that you keep in the Teradata Vantage Analytics Database.

\n", + "\n", + "

With these components, in this notebook you will be able to:

\n", + "\n", + "" + ] + }, + { + "cell_type": "markdown", + "id": "b5f8a3b0-9a4b-4558-a71d-68e483821bc1", + "metadata": {}, + "source": [ + "
\n", + "\n", + "##### For ClearScape:\n", + "

Install/update packages - Do this if you have any issues executing the cells in this notebook.

\n", + "

NOTE: If you update the teradatamlwidgets library, you will need to restart the kernel.

" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "faa4297d", + "metadata": {}, + "outputs": [], + "source": [ + "#pip install teradatamlwidgets" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5cc3b253", + "metadata": {}, + "outputs": [], + "source": [ + "#from teradataml import create_context, get_context, remove_context\n", + "#%run -i ../../UseCases/startup.ipynb\n", + "#eng = create_context(host = 'host.docker.internal', username='demo_user', password = password)\n", + "#print(eng)" + ] + }, + { + "cell_type": "markdown", + "id": "a6505f46", + "metadata": {}, + "source": [ + "
" + ] + }, + { + "cell_type": "markdown", + "id": "4203bda2", + "metadata": {}, + "source": [ + "#### Code Explanation\n", + "Below is the basic setup of the notebook, with just the mandatory parameters for analytic_functions.Ui.\n", + "1. Import the module using the code:\n", + "\n", + "`from teradatamlwidgets import analytic_functions`\n", + "\n", + "2. Set up the inputs. This is a list of the input tables you want to use. Each name may include the schema, e.g. “dssDB.company1_stock”, where dssDB is the schema and company1_stock is the table name.\n", + "\n", + "\n", + "3. Set up the output table name for the function. This name may also include the schema. The result of executing the function is saved to this table under the given schema. This is optional; if not specified, a name will be generated at random." + ] + }, + { + "cell_type": "markdown", + "id": "7790c722", + "metadata": {}, + "source": [ + "#### After Running the Cell\n", + "Execute the next cell to run the database login widget. This login widget has fields for the Host, Username, Password, and VAL location.\n", + "\n", + "Once you have entered this information, click Login. \n" + ] + }, + { + "cell_type": "markdown", + "id": "6195dde6-93c5-463e-a936-e8859e4e0738", + "metadata": {}, + "source": [ + "### Connect to the Database\n", + "\n", + "Run the next cell to open a login widget. Use these values to connect to the database:
\n", + "- **Host** = 'host.docker.internal'\n", + "- **Username** ='demo_user'\n", + "- **Password** = the password you entered when you created this environment.
\n", + "- **VAL** = The default location for the VAL database. This value will be provided in the login widget.\n", + "\n", + "Click the **Login** button to connect." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "57105370", + "metadata": {}, + "outputs": [], + "source": [ + "from teradatamlwidgets import login\n", + "\n", + "ui = login.Ui(val_location=\"VAL\")" + ] + }, + { + "cell_type": "markdown", + "id": "9b9f9d5b", + "metadata": {}, + "source": [ + "### Load Tables\n", + "\n", + "In this example we will load some tables using teradataml. As we have already logged in, we can call teradataml load functions:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "de8de54e", + "metadata": {}, + "outputs": [], + "source": [ + "from teradataml import *\n", + " # Load the example data.\n", + "load_example_data(\"movavg\", \"ibm_stock\")\n", + "load_example_data(\"teradataml\", \"titanic\")\n" + ] + }, + { + "cell_type": "markdown", + "id": "cefe68d9", + "metadata": {}, + "source": [ + "
\n", + "\n", + "## Class: analytic_functions.Ui\n", + "## Constructor" + ] + }, + { + "cell_type": "markdown", + "id": "3c00af8c", + "metadata": {}, + "source": [ + "| Argument | Type | Required | Description | Example |\n", + "|:--------:|:--------:|:--------:|:--------:|:--------:|\n", + "| outputs | List String | Optional | A list of output table names, one for each table the function outputs. Each name can be written as _table_name_ or _schema_name.table_name_ (schema is optional). If not specified, a name will be generated at random. | outputs = [\"dssDB.my_output\", \"my_test\"] |\n", + "| inputs | List String | Mandatory | Option 1: A list of input table names. The listed tables are the options you can choose from after you select the function. Each name can be written as _table_name_ or _schema_name.table_name_ (schema is optional). Option 2: A teradataml DataFrame. | inputs = [\"dssDB.company1_stock\", \"titanic\"] OR inputs = [DataFrame(\"company1_stock\")] |\n", + "| function | String | Optional | To have a specific function selected as soon as the UI appears, pass the function name. | function=\"Linear Regression VAL\" |\n", + "| export_settings | String | Optional | To load and save your chosen parameters, set export_settings to the name of the settings file. | export_settings=\"MyLinReg.json\" |" + ] + }, + { + "cell_type": "markdown", + "id": "521b8d2e", + "metadata": {}, + "source": [ + "
\n", + "\n", + "### Notebook Toolbar Button Functionality" + ] + }, + { + "cell_type": "markdown", + "id": "b7adbe79", + "metadata": {}, + "source": [ + "#### Execute: \n", + "Click this button to execute the function once you have chosen the values you want for the parameters. The result is displayed as a table showing the first few rows.\n", + "\n", + "\n", + "#### Query: \n", + "Click this button to see the query generated from the chosen parameters before you execute it.\n", + " \n", + "\n", + "#### Reset: \n", + "Click this button to reset the function to its defaults, i.e. remove the values you have chosen and start from scratch.\n", + "\n", + "#### Log Out: \n", + "Click this button to log out, which takes you back to the login dialog.\n", + "\n", + "\n", + "#### OPTIONAL: Load and Save\n", + "The Load and Save buttons are shown only if 'export_settings' is included in the constructor of analytic_functions.Ui.\n", + "\n", + "If this is your first time using this file, enter all the values you want and then click the “Save” button to save your choices to the specified file name.\n", + "\n", + "This way, when you rerun the cell (with the same export_settings filename), the saved values are loaded automatically.\n", + "\n", + "If you then change values in the UI and want to revert to the values saved in the file, click the “Load” button. \n", + "\n", + "If you change values in the UI and want to keep the new values, click the “Save” button, which overwrites the contents of the file with the newly chosen values." + ] + }, + { + "cell_type": "markdown", + "id": "1d230bf1", + "metadata": {}, + "source": [ + "
\n", + "\n", + "## Method : get_output_dataframe" + ] + }, + { + "cell_type": "markdown", + "id": "90904115", + "metadata": {}, + "source": [ + "| Argument | Type | Required | Description | Example |\n", + "|:--------:|:--------:|:--------:|:--------:|:--------:|\n", + "| output_index | Int | Optional (Default: 0) | Use this method to get the full output result table. output_index selects which output table to return when the function produces more than one. | dataframe = ui.get_output_dataframe() |" + ] + }, + { + "cell_type": "markdown", + "id": "e615b138", + "metadata": {}, + "source": [ + "#### Return Value: Type: teradataml.DataFrame. Returns the output of the function as a teradataml DataFrame.\n", + "\n", + "See the example below in \"Output Dataframe\"." + ] + }, + { + "cell_type": "markdown", + "id": "a5f35691", + "metadata": {}, + "source": [ + "
\n", + "\n", + "## SQLE Example : Moving Average" + ] + }, + { + "cell_type": "markdown", + "id": "995f705b", + "metadata": {}, + "source": [ + "For the function “MovingAverage”, set the parameters as follows in this example:\n", + "1. In the **_Required_** tab, click the **input** dropdown. \n", + "- **Table** is “ibm_stock”
\n", + "- **Data Partition Option** will not be used because Data By and Order By have no values
\n", + "\n", + "\n", + "2. In the **_Optional_** tab\n", + "- **TargetColumns** needs to be set to “stockprice”. Click in the box and select it from the list.
\n", + " - **IncludeFirst** should be selected.
\n", + "- **Alpha** should be set to 0.22
\n", + "- **StartRows** should be 3
\n", + "- **WindowSize** should be set to 11
\n", + "- **MavgType** should be “M”\n", + "\n", + "3. Click the **Execute** button." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "37fd8c02", + "metadata": {}, + "outputs": [], + "source": [ + "from teradatamlwidgets import analytic_functions\n", + "\n", + "inputs = ['ibm_stock']\n", + "outputs = ['Project_OutMovingAverageTest']\n", + "ui = analytic_functions.Ui(function= 'MovingAverage',\n", + " outputs=outputs, \n", + " inputs=inputs)\n" + ] + }, + { + "cell_type": "markdown", + "id": "9bdbcd2b", + "metadata": {}, + "source": [ + "
\n", + "\n", + "## SQLE Example : Moving Average with parameters from a file" + ] + }, + { + "cell_type": "markdown", + "id": "08a4b761", + "metadata": {}, + "source": [ + "As explained in the OPTIONAL: Load and Save section above, if export_settings is set to a JSON file, the UI automatically loads the parameters saved in that file. You can then modify the parameters if needed and click the Save button, which updates the JSON file accordingly.\n", + "\n", + "In this example, notice that \"export_settings\" is added in the constructor of analytic_functions.Ui, so you simply have to run the cell and the parameters will already be set.\n", + "\n", + "The parameters this time are already set to the following:\n", + "1. In the **_Required_** tab
\n", + "- **Table** is set to “ibm_stock”. You will need to click the **input** dropdown to verify.
\n", + "- **Data Partition Option** is set to \"None\"\n", + "\n", + "2. In the **_Optional_** tab
\n", + "- **TargetColumns** is set to “stockprice”\n", + "- **IncludeFirst** should be _selected_
\n", + "- **Alpha** is 0.22
\n", + "- **StartRows** is 3
\n", + "- **WindowSize** is 11
\n", + "- **Mavg Type** is “M”
\n", + "\n", + "3. Click the **Execute** button." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ed56c1ec", + "metadata": {}, + "outputs": [], + "source": [ + "from teradatamlwidgets import analytic_functions\n", + "\n", + "inputs = ['ibm_stock']\n", + "outputs = ['Project_OutMovingAverageTest']\n", + "ui = analytic_functions.Ui(function= 'MovingAverage',\n", + " outputs=outputs, \n", + " inputs=inputs, \n", + " export_settings=\"data/MovingAverage.json\")\n" + ] + }, + { + "cell_type": "markdown", + "id": "de32dad3", + "metadata": {}, + "source": [ + "
\n", + "\n", + "## Output Dataframe" + ] + }, + { + "cell_type": "markdown", + "id": "e792dcc1", + "metadata": {}, + "source": [ + "In order to access the full output table, use ui.get_output_dataframe()." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "80757c77", + "metadata": {}, + "outputs": [], + "source": [ + "df = ui.get_output_dataframe()\n", + "df" + ] + }, + { + "cell_type": "markdown", + "id": "b725d036", + "metadata": {}, + "source": [ + "
\n", + "\n", + "## Pipeline Structure" + ] + }, + { + "cell_type": "markdown", + "id": "47009045", + "metadata": {}, + "source": [ + "## VAL Example: Linear Regression" + ] + }, + { + "cell_type": "markdown", + "id": "7d62b2aa", + "metadata": {}, + "source": [ + "The VAL examples of Linear Regression followed by Linear Regression Predict demonstrate a possible pipeline structure. In other words, we will create Linear Regression VAL output that will be used as an input to Linear Regression VAL Predict.\n", + "\n", + "The code below already specifies the input, output, function name, and settings file name.\n", + "The settings file will automatically populate the function with the parameters." + ] + }, + { + "cell_type": "markdown", + "id": "66504e6c", + "metadata": {}, + "source": [ + "After running the cell, the parameters are already filled in, as mentioned before. \n", + "\n", + "1. In the **_Required_** tab
\n", + "- **Table** is “titanic”. Click the **input_A_role** dropdown to verify.
\n", + "- **InputColumns** are “age” and “p_class”
\n", + "- **Response Column** is “fare”\n", + "\n", + "\n", + "2. In the **_Optional_** tab
\n", + "- There are no **Group By** columns
\n", + " - No check boxes are selected for the parameters
\n", + "- **Condition Index Threshold** is set to 37
\n", + "- **Entrance Criterion** is 3.84
\n", + "- **Remove Criterion** is 3.84
\n", + "- **Variance Proportion Threshold** is 0.5\n", + "\n", + "3. Click the **Execute** button." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8975be26", + "metadata": {}, + "outputs": [], + "source": [ + "from teradatamlwidgets import analytic_functions\n", + "\n", + "inputs = ['titanic']\n", + "outputs = ['Project_OutLinearRegression']\n", + "ui = analytic_functions.Ui(outputs=outputs, \n", + " function=\"Linear Regression\", \n", + " inputs=inputs, \n", + " export_settings=\"data/LinReg.json\")\n" + ] + }, + { + "cell_type": "markdown", + "id": "c20c6839", + "metadata": {}, + "source": [ + "After you click Execute, you see the resulting table. We want to use this table as the input to the Linear Regression Predict in VAL." + ] + }, + { + "cell_type": "markdown", + "id": "6a22a431", + "metadata": {}, + "source": [ + "## VAL Example: Linear Regression Predict" + ] + }, + { + "cell_type": "markdown", + "id": "678fc8e6", + "metadata": {}, + "source": [ + "This is step 2 of the pipeline structure.\n", + " \n", + "Notice that the inputs list contains “titanic” and ALSO “Project_OutLinearRegression”, the output table name we specified in the previous cell for Linear Regression VAL. \n", + "\n", + "Once again, the export_settings file has the saved values for the different parameters.\n", + "\n", + "1. In the **_Required_** tab
\n", + "- The **input_A_role** dropdown has **Table** set to “titanic”
\n", + "- The **linear_regression** dropdown has **Table** set to “Project_OutLinearRegression”, which is the output from the previous cell.
\n", + "\n", + "2. In the **_Optional_** tab
\n", + "- **Index Columns** has “SURVIVED” already selected
\n", + "- **Response Column** has “fare” already selected
\n", + "- **Accumulate** has “age” and “p_class” already selected
\n", + "\n", + "3. Click the **Execute** button." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a88bccb4", + "metadata": {}, + "outputs": [], + "source": [ + "from teradatamlwidgets import analytic_functions\n", + "\n", + "inputs = ['titanic', 'Project_OutLinearRegression']\n", + "outputs = ['Project_OutLinearRegressionPredict']\n", + "ui2 = analytic_functions.Ui(outputs=outputs,\n", + " function=\"Linear Regression Predict\", \n", + " inputs=inputs, \n", + " export_settings=\"data/LinRegPredict.json\")" + ] + }, + { + "cell_type": "markdown", + "id": "ebc976b8", + "metadata": {}, + "source": [ + "After you click Execute, the table shown is the Linear Regression prediction. It is calculated using the output of the Linear Regression VAL function we ran above. " + ] + }, + { + "cell_type": "markdown", + "id": "43c7d1df-6c38-4ba4-bbda-73bfdf72b04e", + "metadata": {}, + "source": [ + "## Close the Database Connection\n", + "\n", + "When you have finished executing these teradataml features, please close your connection to the database." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bfa0afff-2362-4f7d-8cd3-3c0ed0369932", + "metadata": {}, + "outputs": [], + "source": [ + "remove_context()" + ] + }, + { + "cell_type": "markdown", + "id": "5686d09b-cf1f-43d4-b43c-bb2be1cf1268", + "metadata": {}, + "source": [ + "" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.14" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/Getting_Started/Teradataml_Widgets/AutoML_Tutorial.ipynb b/Getting_Started/Teradataml_Widgets/AutoML_Tutorial.ipynb new file mode 100644 index 00000000..84323833 --- /dev/null +++ b/Getting_Started/Teradataml_Widgets/AutoML_Tutorial.ipynb @@ -0,0 +1,327 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "371d27f5", + "metadata": {}, + "source": [ + "
\n", + "

\n", + " Using AutoML with Teradataml Widgets\n", + "
\n", + " \"Teradata\"\n", + "

\n", + "
" + ] + }, + { + "cell_type": "markdown", + "id": "56d7d758", + "metadata": {}, + "source": [ + "

Introduction

\n", + "\n", + "

Teradataml Widgets (teradatamlwidgets) extends teradataml’s built-in capabilities for interacting with the Teradata Vantage™ Data and Analytics Platform. It provides visual components for scaled, in-database analytics, run from within a notebook, on data that you keep in the Teradata Vantage Analytics Database.

\n", + "\n", + "

With these components, in this notebook you will be able to:

\n", + "\n", + "" + ] + }, + { + "cell_type": "markdown", + "id": "6a011d3f", + "metadata": {}, + "source": [ + "
\n", + "\n", + "##### For ClearScape:\n", + "

Install/update packages - Do this if you have any issues executing the cells in this notebook.

\n", + "

NOTE: If you update the teradatamlwidgets library, you will need to restart the kernel.

" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4cd01e2e", + "metadata": {}, + "outputs": [], + "source": [ + "#pip install --upgrade teradatamlwidgets" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2ca12836", + "metadata": {}, + "outputs": [], + "source": [ + "#from teradataml import create_context, get_context, remove_context\n", + "#%run -i ../../UseCases/startup.ipynb\n", + "#eng = create_context(host = 'host.docker.internal', username='demo_user', password = password)\n", + "#print(eng)" + ] + }, + { + "cell_type": "markdown", + "id": "175ba860", + "metadata": {}, + "source": [ + "
" + ] + }, + { + "cell_type": "markdown", + "id": "801b06ad", + "metadata": {}, + "source": [ + "Execute the next cell to open a Database Login Widget.
\n", + "Use these values:
\n", + "- **Host** = 'host.docker.internal'
\n", + "- **Username** ='demo_user'
\n", + "- **Password** = the password you entered when you created this environment.
\n", + "\n", + "Then click the **Login** button." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6eb48ee4", + "metadata": {}, + "outputs": [], + "source": [ + "from teradatamlwidgets import login\n", + "ui = login.Ui(default_database='host.docker.internal')" + ] + }, + { + "cell_type": "markdown", + "id": "84f2ea00", + "metadata": { + "tags": [] + }, + "source": [ + "## Setting up the test and train data\n", + "\n", + "### Load Data and DataFrame Objects\n", + "\n", + "In this example we will load some tables using teradataml. As we have already logged in, we can call teradataml load functions:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d9e67adf", + "metadata": {}, + "outputs": [], + "source": [ + "from teradataml import load_example_data\n", + "from teradataml import DataFrame\n", + "load_example_data(\"teradataml\", \"iris_input\")\n", + "iris = DataFrame.from_table(\"iris_input\")\n", + "iris.head()" + ] + }, + { + "cell_type": "markdown", + "id": "ae6c0466", + "metadata": {}, + "source": [ + "### Test Train Split" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4fc773e0", + "metadata": {}, + "outputs": [], + "source": [ + "iris_sample = iris.sample(frac = [0.8, 0.2])\n", + "# Fetching train and test data\n", + "iris_train = iris_sample[iris_sample['sampleid'] == 1].drop('sampleid', axis=1)\n", + "iris_test = iris_sample[iris_sample['sampleid'] == 2].drop('sampleid', axis=1)\n", + "from teradataml import copy_to_sql\n", + "copy_to_sql(df = iris_train, table_name = \"iris_train\", if_exists=\"replace\")\n", + "copy_to_sql(df = iris_test, table_name = \"iris_test\", if_exists=\"replace\")" + ] + }, + { + "cell_type": "markdown", + "id": "e38b8ba4", + "metadata": { + "tags": [] + }, + "source": [ + "#### Code Explanation\n", + "Below is the basic setup of the notebook with the mandatory parameters.\n", + "\n", + "Import the module using the code:\n", + "\n", + "`from teradatamlwidgets import auto_ml`\n" + ] + }, + { + "cell_type": "markdown", + "id": "92193dc4", + "metadata": {}, + "source": [ + "
\n", + "\n", + "## Example: AutoML Classification Standalone\n", + "\n", + "Run the cell to open the AutoML UI. Then follow these steps:\n", + "\n", + "1. In Initialize tab, set Target Column to species and click Fit\n", + "2. After Fit has completed, click the Close button.\n", + "3. Then click Leaderboard to see the top leaders. Then click the Close button.\n", + "4. In the Prediction tab, click Execute to calculate the predicted output. Clicking the Close button will return you to the AutoML configuration options." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9076b46b", + "metadata": {}, + "outputs": [], + "source": [ + "from teradatamlwidgets import auto_ml\n", + "\n", + "ui = auto_ml.Ui(\n", + " task=\"Classification\", \n", + " training_table=iris_train, \n", + " predict_table='iris_input',\n", + " algorithms=['xgboost', 'knn'],\n", + " verbose=0,\n", + " max_runtime_secs=300,\n", + " max_models=5)" + ] + }, + { + "cell_type": "markdown", + "id": "36a451b3", + "metadata": {}, + "source": [ + "
\n", + "\n", + "## Predict Dataframe\n", + "In order to access the predicted output table, run the below command:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "80e0cb77", + "metadata": {}, + "outputs": [], + "source": [ + "ui.get_prediction_dataframe()" + ] + }, + { + "cell_type": "markdown", + "id": "f6b66670", + "metadata": {}, + "source": [ + "
\n", + "\n", + "## Leaderboard\n", + "In order to access the leaderboard, run the below command:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2b62c5a3", + "metadata": {}, + "outputs": [], + "source": [ + "ui.get_leaderboard()" + ] + }, + { + "cell_type": "markdown", + "id": "617ca0b6", + "metadata": {}, + "source": [ + "
\n", + "\n", + "## AutoML instance\n", + "In order to access the AutoML instance, run the below command:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "09fa057c", + "metadata": {}, + "outputs": [], + "source": [ + "ui.get_auto_ml()" + ] + }, + { + "cell_type": "markdown", + "id": "bc345b04-3c59-43db-855b-1d2ddd254883", + "metadata": { + "tags": [] + }, + "source": [ + "## Example: AutoML Classification Integrated Function UI" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2aea4135-db32-4731-8fee-dd3153292974", + "metadata": {}, + "outputs": [], + "source": [ + "from teradatamlwidgets import analytic_functions\n", + "\n", + "inputs = ['iris_train']\n", + "ui = analytic_functions.Ui(function=\"AutoML\", inputs=inputs)" + ] + }, + { + "cell_type": "markdown", + "id": "72c55c42-fec8-4fd3-b394-22cfc03da87c", + "metadata": {}, + "source": [ + "" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.14" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/Getting_Started/Teradataml_Widgets/BYOM_Tutorial.ipynb b/Getting_Started/Teradataml_Widgets/BYOM_Tutorial.ipynb new file mode 100644 index 00000000..122abb5e --- /dev/null +++ b/Getting_Started/Teradataml_Widgets/BYOM_Tutorial.ipynb @@ -0,0 +1,273 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "c5f93b60", + "metadata": {}, + "source": [ + "
\n", + "

\n", + " Using BYOM with Teradataml Widgets\n", + "
\n", + " \"Teradata\"\n", + "

\n", + "
" + ] + }, + { + "cell_type": "markdown", + "id": "219acebd", + "metadata": {}, + "source": [ + "

Introduction

\n", + "\n", + "

Teradataml Widgets (teradatamlwidgets) extends teradataml’s built-in capabilities for interacting with the Teradata Vantage™ Data and Analytics Platform. It provides visual components for scaled, in-database analytics, run from within a notebook, on data that you keep in the Teradata Vantage Analytics Database.

\n", + "\n", + "

With these components, in this notebook you will be able to:

\n", + "\n", + "" + ] + }, + { + "cell_type": "markdown", + "id": "a013379f-0930-4a81-aafb-17f3aaecb319", + "metadata": {}, + "source": [ + "
\n", + "\n", + "##### For ClearScape:\n", + "

Install/update packages - Do this if you have any issues executing the cells in this notebook.

\n", + "

NOTE: If you update the teradatamlwidgets library, you will need to restart the Kernel.

" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4cd01e2e", + "metadata": {}, + "outputs": [], + "source": [ + "#pip install teradatamlwidgets" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2ca12836", + "metadata": {}, + "outputs": [], + "source": [ + "#from teradataml import create_context, get_context, remove_context\n", + "#%run -i ../../UseCases/startup.ipynb\n", + "#eng = create_context(host = 'host.docker.internal', username='demo_user', password = password)\n", + "#print(eng)" + ] + }, + { + "cell_type": "markdown", + "id": "175ba860", + "metadata": {}, + "source": [ + "
" + ] + }, + { + "cell_type": "markdown", + "id": "801b06ad", + "metadata": {}, + "source": [ + "#### After Running the Cell\n", + "\n", + "After running the notebook cell below, a login screen shows. This login screen has a place for the Host, Username and Password.\n", + "\n", + "Once you type in this information, click Login." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6eb48ee4", + "metadata": {}, + "outputs": [], + "source": [ + "from teradatamlwidgets import login\n", + "\n", + "ui = login.Ui(default_database='host.docker.internal')" + ] + }, + { + "cell_type": "markdown", + "id": "84f2ea00", + "metadata": {}, + "source": [ + "
\n", + "\n", + "## Setting up the Model\n", + "\n", + "### Load Data and DataFrame Objects\n", + "\n", + "In this example we will load some tables using teradataml. As we have already logged in, we can call teradataml load functions:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d9e67adf", + "metadata": {}, + "outputs": [], + "source": [ + "# Import required libraries / functions.\n", + "import os, teradataml\n", + "from teradataml import get_connection, DataFrame\n", + "from teradataml import save_byom, retrieve_byom, load_example_data\n", + "from teradataml import configure, display_analytic_functions, execute_sql\n", + "\n", + "# Load example data.\n", + "load_example_data(\"byom\", \"iris_test\")\n", + "\n", + "# Create teradataml DataFrame objects.\n", + "iris_test = DataFrame.from_table(\"iris_test\")" + ] + }, + { + "cell_type": "markdown", + "id": "ae6c0466", + "metadata": {}, + "source": [ + "### Load Model into Vantage" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4fc773e0", + "metadata": {}, + "outputs": [], + "source": [ + "# Load model file into Vantage.\n", + "model_file = os.path.join(os.path.dirname(teradataml.__file__), \"data\", \"models\", \"dr_iris_rf\")\n", + "try: \n", + " save_byom(\"dr_iris_rf\", model_file, \"byom_models\")\n", + "except Exception as err:\n", + " print(f\"save_byom did not complete (the model may already be saved): {err}\")" + ] + }, + { + "cell_type": "markdown", + "id": "e38b8ba4", + "metadata": {}, + "source": [ + "#### Code Explanation\n", + "Below is the basic setup of the notebook with the mandatory parameters.\n", + "\n", + "Import the module using the code:\n", + "\n", + "`from teradatamlwidgets import byom_functions`" + ] + }, + { + "cell_type": "markdown", + "id": "92193dc4", + "metadata": {}, + "source": [ + "
\n", + "\n", + "## Example: DataRobotPredict" + ] + }, + { + "cell_type": "markdown", + "id": "4c279e9d", + "metadata": {}, + "source": [ + "Score data in Vantage with a model that was created outside Vantage, removing all cached models first.\n", + "\n", + "The parameters for the DataRobotPredict example are as follows:\n", + "\n", + "- Set Accumulate to 'id', 'sepal_length', 'petal_length'\n", + "- Set OverwriteCachedModel to \"*\"\n", + " \n", + "- Function is already set to DataRobotPredict\n", + "- BYOM Install is already set to \"mldb\"\n", + "- InputTable is already set to \"iris_test\"\n", + "- ModelID is already set to \"dr_iris_rf\"\n", + "- ModelTable is already set to \"byom_models\"" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9076b46b", + "metadata": {}, + "outputs": [], + "source": [ + "from teradatamlwidgets import byom_functions\n", + "\n", + "ui = byom_functions.Ui(function = \"DataRobotPredict\", \n", + " byom_location = \"mldb\", \n", + " data=\"iris_test\", \n", + " model_id=\"dr_iris_rf\", \n", + " model_table=\"byom_models\")" + ] + }, + { + "cell_type": "markdown", + "id": "36a451b3", + "metadata": {}, + "source": [ + "
\n", + "\n", + "## Output Dataframe\n", + "In order to access the full output table, run the below command:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "80e0cb77", + "metadata": {}, + "outputs": [], + "source": [ + "ui.get_output_dataframe()" + ] + }, + { + "cell_type": "markdown", + "id": "639eaaa2-5228-411c-afd5-cc90cfecf051", + "metadata": {}, + "source": [ + "" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.14" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/Getting_Started/Teradataml_Widgets/EDA.ipynb b/Getting_Started/Teradataml_Widgets/EDA.ipynb new file mode 100644 index 00000000..30a2ff87 --- /dev/null +++ b/Getting_Started/Teradataml_Widgets/EDA.ipynb @@ -0,0 +1,409 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "5005165d-678a-4967-aee6-74f0c3ca4e80", + "metadata": {}, + "source": [ + "
\n", + "

\n", + " Exploratory Data Analysis using Teradataml Widgets\n", + "
\n", + " \"Teradata\"\n", + "

\n", + "
" + ] + }, + { + "cell_type": "markdown", + "id": "4ec1d34c-1610-4ce1-8e89-b56cfeb83090", + "metadata": {}, + "source": [ + "

Introduction

\n", + "\n", + "

The Exploratory Data Analysis (EDA) UI allows the user to take a deeper look into their dataset. The tabs include Data, Analyze, Visualize, Describe, and Persist.\n", + "This provides visual components for scaled, in-Database Analytics with data that you keep in the Teradata Vantage Analytics Database within a notebook.

\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "" + ] + }, + { + "cell_type": "markdown", + "id": "f26f5e9c-041e-4903-98a8-cb5ccd81072d", + "metadata": {}, + "source": [ + "#### Code Explanation\n", + "Below is the basic setup of the notebook to call the EDA UI:\n", + "1. Create a connection\n", + " - Use these values: host = 'host.docker.internal'\n", + " - username='demo_user'\n", + " - password = the password you entered when you created this environment.\n", + "

\n", + "2. Call the DataFrame" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fec0b072-430d-4637-a8c3-191af6752b25", + "metadata": {}, + "outputs": [], + "source": [ + "from teradataml import *\n", + "import getpass\n", + "conn = create_context(host=getpass.getpass(\"Hostname: \"), \n", + " username=getpass.getpass(\"Username: \"),\n", + " password=getpass.getpass(\"Password: \"))" + ] + }, + { + "cell_type": "markdown", + "id": "7f388904-2a84-4705-baec-45940d93e1a3", + "metadata": {}, + "source": [ + "### Load Tables\n", + "\n", + "In this example we will load some tables using teradataml. As we have already logged in, we can call teradataml load functions:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6eb7821d-fc71-437a-8266-e842fce9b385", + "metadata": {}, + "outputs": [], + "source": [ + "# Load the example data.\n", + "load_example_data(\"movavg\", \"ibm_stock\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1a0c9501-de14-4131-a1a9-6e8f8ab973e2", + "metadata": {}, + "outputs": [], + "source": [ + "load_example_data(\"customer\")" + ] + }, + { + "cell_type": "markdown", + "id": "0aeb5a95-f932-4837-a0e8-49bd42b42cc8", + "metadata": {}, + "source": [ + "### Set the DataFrame\n", + "By setting the DataFrame and calling it, the EDA UI will appear with the 5 different tabs (Data, Describe, Visualize, Analyze, and Persist)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3af4dd05-7896-4153-9b5a-64993ae203d6", + "metadata": {}, + "outputs": [], + "source": [ + "df = DataFrame(\"ibm_stock\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0a64c980-8193-4dd7-8d5a-5e98c1479112", + "metadata": {}, + "outputs": [], + "source": [ + "df" + ] + }, + { + "cell_type": "markdown", + "id": "06d098e1-7c4e-4fae-a5e4-36d0d38a4896", + "metadata": {}, + "source": [ + "### Pipeline Example" + ] + }, + { + "cell_type": "markdown", + "id": "6594bb1e-6523-4eb1-9380-d95de564f8b9", + "metadata": {}, + "source": [ + "Pipelining is possible within the Analyze tab. \n", + "1. Click on the **Analyze** tab\n", + "2. Type in and choose \"Linear Regression\"\n", + "3. Under the Required tab, for *Input Columns* select \"age\", \"years_with_bank\", and \"nbr_children\". For *Response Column* choose \"income\"\n", + "4. Click the **Execute** button\n", + "5. Click the **Add to Pipeline** button\n", + "6. Type in and choose \"Linear Regression Predict\"\n", + "7. Under the Optional tab, for *Index Columns* select \"age\" and \"years_with_bank\". For *Response Column* choose \"income\". For *Accumulate* choose \"nbr_children\"\n", + "8. Click the **Execute** button" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2b806fc4-f9a9-4b09-b1eb-4534b5734553", + "metadata": {}, + "outputs": [], + "source": [ + "df = DataFrame(\"Customer\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "328572f6-2d57-4559-9c9c-16e5d6287f1a", + "metadata": {}, + "outputs": [], + "source": [ + "df" + ] + }, + { + "cell_type": "markdown", + "id": "7ef92fe0-9e15-4f1c-a5f1-fc375705128a", + "metadata": {}, + "source": [ + "### Analyze Tab - Access Output DataFrame\n", + "After executing a function in the Analyze tab, you can access the last output DataFrame by calling the method below." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a3b459a6-50b1-49af-9650-13c2f2e19cef", + "metadata": {}, + "outputs": [], + "source": [ + "df.get_output()" + ] + }, + { + "cell_type": "markdown", + "id": "690e4990-1461-4e8f-9573-f8e5d9ea55aa", + "metadata": {}, + "source": [ + "### Also possible with query input" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d0b56ba2-fd89-4683-b9bd-b0af9897b37b", + "metadata": {}, + "outputs": [], + "source": [ + "df = DataFrame(\"customer\")\n", + "# Add 30 constant columns so the DataFrame is derived from a query rather than a plain table.\n", + "for i in range(30):\n", + "    df = df.assign(**{\"column_long_xyz_{}\".format(i): i})\n", + "df" + ] + }, + { + "cell_type": "markdown", + "id": "cb1dc9b3-f244-40dd-aac0-2a646bc956a3", + "metadata": {}, + "source": [ + "## Approach 2\n", + "The EDA UI can also be called directly from teradatamlwidgets, using `from teradatamlwidgets import eda`\n", + "\n", + "#### Code Explanation\n", + "Below is the basic setup of the notebook to call the EDA UI:\n", + "1. Create a connection using teradataml `create_context`, or log in using:\n", + "\n", + " `from teradatamlwidgets import login`\n", + "\n", + " `ui = login.Ui(val_location=\"VAL\")`\n", + "\n", + "2. Import the module using: \n", + "\n", + " `from teradatamlwidgets import eda`\n", + "\n", + "3. 
Set up the input DataFrame and call the UI using:\n", + " \n", + " `ui = eda.Ui(df = df)`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "82ede802-edf1-4786-bed1-3c348e7d86d1", + "metadata": {}, + "outputs": [], + "source": [ + "from teradatamlwidgets import eda\n", + "\n", + "df = DataFrame(\"ibm_stock\")\n", + "\n", + "ui = eda.Ui(df = df)" + ] + }, + { + "cell_type": "markdown", + "id": "0f66716a-f186-4503-8f51-b73f39919482", + "metadata": {}, + "source": [ + "##### Each tab from the EDA UI is also callable separately within teradatamlwidgets.\n", + "\n", + "Analyze: `from teradatamlwidgets import analytic_functions` \n", + "\n", + "Visualize: `from teradatamlwidgets import plot` \n", + "\n", + "Describe: `from teradatamlwidgets import describe` \n", + "\n", + "Persist: `from teradatamlwidgets import persist` \n", + "\n", + "\n", + "*The explanation for **Analyze** (analytic_functions) is available in the separate notebook \"Analytic_Functions_Tutorial.ipynb\".*\n", + "\n", + "*The explanation for **Visualize** (plot) is available in the separate notebook \"Plot_Notebook.ipynb\".*\n", + "\n", + "\n", + "## Describe\n", + "#### Code Explanation\n", + "Below is the basic setup of the notebook to call the Describe UI:\n", + "1. Create a connection using teradataml `create_context`, or log in using:\n", + " \n", + " `from teradatamlwidgets import login`\n", + "\n", + " `ui = login.Ui(val_location=\"VAL\")`\n", + "\n", + "2. Import the module using: \n", + "\n", + " `from teradatamlwidgets import describe`\n", + "\n", + "3. 
Set up the input DataFrame and call the UI using:\n", + " \n", + " `ui = describe.Ui(df = df)`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5445f1ad-06c7-48b9-81cd-b1e3159e2e17", + "metadata": {}, + "outputs": [], + "source": [ + "from teradatamlwidgets import describe\n", + "\n", + "df = DataFrame(\"ibm_stock\")\n", + "\n", + "ui = describe.Ui(df = df)" + ] + }, + { + "cell_type": "markdown", + "id": "9aa09f9c-6f76-4465-b0e4-f4c32720c323", + "metadata": {}, + "source": [ + "## Persist\n", + "#### Code Explanation\n", + "Below is the basic setup of the notebook to call the Persist UI:\n", + "1. Create a connection using teradataml `create_context`, or log in using:\n", + " \n", + " `from teradatamlwidgets import login`\n", + "\n", + " `ui = login.Ui(val_location=\"VAL\")`\n", + "\n", + "2. Import the module using: \n", + "\n", + " `from teradatamlwidgets import persist`\n", + "\n", + "3. Set up the input DataFrame and call the UI using:\n", + " \n", + " `ui = persist.Ui(df = df)`" + ] + }, + { + "cell_type": "markdown", + "id": "f9f6e364-0a3c-4e3d-8e4d-4aa955b4147c", + "metadata": {}, + "source": [ + "## Close the Database Connection\n", + "\n", + "When you have finished using these teradataml features, please close your connection to the database." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4ab52689-8842-496e-9d9a-b9d37301a0c1", + "metadata": {}, + "outputs": [], + "source": [ + "remove_context()" + ] + }, + { + "cell_type": "markdown", + "id": "2e6cf1ff-2ff7-4e96-bee4-a632cf7167d2", + "metadata": {}, + "source": [ + "" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.14" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/Getting_Started/Teradataml_Widgets/Plot_Notebook.ipynb b/Getting_Started/Teradataml_Widgets/Plot_Notebook.ipynb index ec1eef92..3124530f 100644 --- a/Getting_Started/Teradataml_Widgets/Plot_Notebook.ipynb +++ b/Getting_Started/Teradataml_Widgets/Plot_Notebook.ipynb @@ -2,12 +2,12 @@ "cells": [ { "cell_type": "markdown", - "id": "79fd4b3c-6c41-484f-93be-248f6330f552", + "id": "de32dad3", "metadata": {}, "source": [ "
\n", "

\n", - " Plotting with Teradataml Widgets\n", + " Plotting Datasets with Teradataml Widgets\n", "
\n", " \"Teradata\"\n", "

\n", @@ -16,97 +16,85 @@ }, { "cell_type": "markdown", - "id": "47369844-f8f9-41c9-bb1f-21e8ab3862a2", + "id": "ae1f9877", "metadata": {}, "source": [ - "

Introduction

\n", + "

Introduction

\n", "\n", - "

The Teradataml Widgets (teradatamlwidgets) enhances teradataml’s built-in interaction capabilities with the Teradata Vantage™ Data and Analytics Platform. This provides visual components for scaled, in-Database Analytics with data that you keep in the Teradata Vantage Analytics Database within a notebook.

\n", + "

The Teradataml Widgets (teradatamlwidgets) enhances teradataml’s built-in interaction capabilities with the Teradata Vantage™ Data and Analytics Platform. This provides visual components for scaled, in-Database Analytics with data that you keep in the Teradata Vantage Analytics Database within a notebook.

\n", "\n", - "

With these components, in a notebook you will be able to:

\n", + "

With these components, in this notebook you will be able to:

\n", "\n", - "