
Commit be81852

added mlops fraud detection docs. small edd change
1 parent 76b6a54 commit be81852

14 files changed: +413 -1 lines changed
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
---
title: MLOps Fraud Detection
date: 2023-11-12
validated: false
summary: This pattern demonstrates how Red Hat OpenShift AI and MLFlow can be used together to build an end-to-end MLOps platform. It demonstrates this using a credit card fraud detection use case.
products:
- Red Hat OpenShift Container Platform
- Red Hat OpenShift AI
- Red Hat OpenShift Data Foundation
industries:
- financial services
aliases: /mlops-fraud-detection/
pattern_logo: mlops-fraud-detection.png
links:
  install: mfd-getting-started
  arch: https://www.redhat.com/architect/portfolio/architecturedetail?ppid=6
  help: https://groups.google.com/g/validatedpatterns
  bugs: https://github.com/arslankhanali/mlops-fraud-detection/issues
ci: mfd
contributor:
  name: Arslan Khan
  contact: mailto:arskhan@redhat.com
  git: https://github.com/arslankhanali
---
:toc:
:imagesdir: /images
:_content-type: ASSEMBLY

include::modules/mfd-about-mlops-fraud-detection.adoc[leveloffset=+1]

include::modules/mfd-architecture.adoc[leveloffset=+1]

[id="next-steps_mfd-index"]
== Next steps

* link:mfd-getting-started[Deploy the management hub] using Helm.
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
---
title: Getting started
weight: 10
aliases: /mlops-fraud-detection/getting-started/
---
:toc:
:imagesdir: /images
:_content-type: ASSEMBLY

include::modules/mfd-deploying-mfd-pattern.adoc[leveloffset=+1]

include::modules/mfd-using-mfd-pattern.adoc[leveloffset=+1]

== Next steps

link:https://groups.google.com/g/hybrid-cloud-patterns[Help & Feedback]
link:https://github.com/validatedpatterns/mlops-fraud-detection/issues[Report Bugs]

modules/edd-deploying-edd-pattern.adoc

Lines changed: 1 addition & 1 deletion
@@ -145,7 +145,7 @@ Emerging Disease Detection Validated Pattern.

You can run `make predeploy` to check your values. This will allow you to review your values and change them in
case there are typos or old values. The values files that should be reviewed prior to deploying the
-Medical Diagnosis Validated Pattern are:
+Emerging Disease Detection Validated Pattern are:

|===
| Values File | Description
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
:_content-type: CONCEPT
:imagesdir: ../../images

[id="about-mlops-fraud-detection-pattern"]
= About the MLOps Fraud Detection pattern

MLOps Credit Card Fraud Detection use case::
* Build and train models in RHODS to detect credit card fraud
* Track and store those models with MLFlow
* Serve a model stored in MLFlow using RHODS Model Serving (or MLFlow serving)
* Deploy a model application in OpenShift that sends data to the served model and displays the prediction

Background::
AI technology is already transforming the financial services industry. AI models can make rapid inferences that benefit the financial services institution and its customers. This pattern deploys an AI model to detect fraud in credit card transactions.

[id="about-solution"]
== About the solution

The model is built on a Credit Card Fraud Detection model, which predicts whether a credit card transaction is fraudulent based on a few parameters, such as the distance from home and from the last transaction, the purchase price compared to the median purchase price, whether the retailer has been purchased from before, whether the PIN was used, and whether the order was placed online.
22+
== Technology Highlights:
23+
* Event-Driven Architecture
24+
* Data Science on OpenShift
25+
* Model registry using MLFlow
26+
27+
== Solution Discussion
28+
29+
This architecture pattern demonstrates four strengths:
30+
31+
* *Real-Time Processing*: Analyze transactions in real-time, quickly identifying and flagging potentially fraudulent activities. This speed is crucial in preventing unauthorized transactions before they are completed.
32+
* *Pattern Recognition*: Detect patterns and anomalies in data and learn from historical transaction data to identify typical spending patterns of a cardholder and flag transactions that deviate from these patterns.
33+
* *Cost Efficiency*: By automating the detection process, AI reduces the need for extensive manual review of transactions, which can be time-consuming and costly.
34+
* *Flexibility and Agility*: An cloud native architecture that supports the use of microservices, containers, and serverless computing, allowing for more flexible and agile development and deployment of AI models. This means faster iteration and deployment of new fraud detection algorithms.
35+
36+
// video link to a presentation on the use case
37+
.Overview of the solution in credit card fraud detection
38+
* video link coming soon
39+
// video::VHjpKIeviFE[youtube]

modules/mfd-architecture.adoc

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
:_content-type: CONCEPT
:imagesdir: ../../images

[id="overview-architecture"]
== Overview of the Architecture

Description of each component:

* *Data Set*: The data set contains the data used for training and evaluating the model we build in this demo.
* *RHODS Notebook*: We build and train the model using a Jupyter notebook running in RHODS.
* *MLFlow Experiment tracking*: We use MLFlow to track the parameters and metrics (such as accuracy, loss, and so on) of a model training run. These runs can be grouped under different "experiments", making it easy to keep track of them. See the sketch after this list.
* *MLFlow Model registry*: As we track the experiment we also store the trained model through MLFlow so we can easily version it and assign a stage to it (for example Staging, Production, Archive).
* *S3 (ODF)*: This is where the models are stored and what the MLFlow model registry interfaces with. We use ODF (OpenShift Data Foundation) according to the MLFlow guide, but it can be replaced with another solution.
* *RHODS Model Serving*: We recommend RHODS Model Serving for serving the model. It is based on ModelMesh and lets us easily send requests to an endpoint to get predictions.
* *Application interface*: This is the interface used to run predictions with the model. In our case, we build a visual interface (interactive app) using Gradio and let it load the model from the MLFlow model registry.
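
To make the MLFlow tracking and registry components concrete, here is a minimal sketch of logging a run and registering the model. The tracking URI, experiment name, and model name are illustrative assumptions, and `model` is the trained estimator from the earlier training sketch:

[source,python]
----
# Minimal MLflow sketch; URIs and names are illustrative assumptions.
import mlflow
import mlflow.sklearn
from mlflow.tracking import MlflowClient

mlflow.set_tracking_uri("http://mlflow-server:8080")  # hypothetical in-cluster route
mlflow.set_experiment("credit-card-fraud")

with mlflow.start_run():
    # Track the parameters and metrics of this training run.
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("accuracy", 0.95)  # placeholder value
    # Store the trained model (from the earlier training sketch) and
    # register it so it can be versioned and staged.
    mlflow.sklearn.log_model(model, "model", registered_model_name="fraud-detection")

# Assign a stage to the registered version (for example Staging, Production).
client = MlflowClient()
client.transition_model_version_stage(name="fraud-detection", version="1", stage="Staging")
----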

//figure 1 originally
.Overview of the solution reference architecture
image::mlops-fraud-detection/mfd-reference-architecture.png[link="/images/mlops-fraud-detection/mfd-reference-architecture.png"]
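
Once the model is served through RHODS Model Serving, the application interface calls its REST endpoint. A hedged sketch follows, assuming a KServe v2-style inference route exposed by ModelMesh; the URL, model name, input tensor name, and feature values are illustrative assumptions:

[source,python]
----
# Illustrative request to a KServe v2-style inference endpoint.
# The route, model name, tensor name, and feature values are assumptions.
import requests

URL = "https://fraud-model.example.com/v2/models/fraud-detection/infer"

payload = {
    "inputs": [{
        "name": "dense_input",
        "shape": [1, 6],
        "datatype": "FP32",
        # distance_from_home, distance_from_last_transaction,
        # ratio_to_median_purchase_price, repeat_retailer,
        # used_pin_number, online_order
        "data": [57.9, 0.31, 1.95, 1.0, 0.0, 0.0],
    }]
}

response = requests.post(URL, json=payload, timeout=30)
response.raise_for_status()
print("prediction:", response.json()["outputs"][0]["data"])
----

The Gradio application interface wraps a call like this in a simple web form so that users can try predictions interactively.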

//figure 2 logical
.Logical Architecture
//image::mlops-fraud-detection/mfd-logical-architecture-legend.png[link="/images/mlops-fraud-detection/mfd-logical-architecture-legend.png", width=940]

//figure 3 Schema
.Data Flow Architecture
//image::mlops-fraud-detection/mfd-schema-dataflow.png[link="/images/mlops-fraud-detection/mfd-schema-dataflow.png", width=940]

[id="about-technology"]
== About the technology

The following technologies are used in this solution:

https://www.redhat.com/en/technologies/cloud-computing/openshift/try-it[Red Hat OpenShift Platform]::
An enterprise-ready Kubernetes container platform built for an open hybrid cloud strategy. It provides a consistent application platform to manage hybrid cloud, public cloud, and edge deployments. It delivers a complete application platform for both traditional and cloud-native applications, allowing them to run anywhere. OpenShift has a pre-configured, pre-installed, and self-updating monitoring stack that provides monitoring for core platform components. It also enables the use of external secret management systems, for example, HashiCorp Vault in this case, to securely add secrets into the OpenShift platform.

https://www.redhat.com/en/technologies/cloud-computing/openshift/openshift-ai[Red Hat OpenShift AI]::
Red Hat® OpenShift® AI is an AI-focused portfolio that provides tools to train, tune, serve, monitor, and manage AI/ML experiments and models on Red Hat OpenShift. It brings data scientists, developers, and IT together on a unified platform to deliver AI-enabled applications faster.

https://www.redhat.com/en/technologies/cloud-computing/openshift/try-it[Red Hat OpenShift GitOps]::
A declarative application continuous delivery tool for Kubernetes based on the ArgoCD project. Application definitions, configurations, and environments are declarative and version controlled in Git. It can automatically push the desired application state into a cluster, quickly find out if the application state is in sync with the desired state, and manage applications in multi-cluster environments.

https://www.redhat.com/en/technologies/jboss-middleware/amq[Red Hat AMQ Streams]::
Red Hat AMQ Streams is a massively scalable, distributed, and high-performance data streaming platform based on the Apache Kafka project. It offers a distributed backbone that allows microservices and other applications to share data with high throughput and low latency. Red Hat AMQ Streams is available in the Red Hat AMQ product.

HashiCorp Vault (community)::
Provides a secure centralized store for dynamic infrastructure and applications across clusters, including over low-trust networks between clouds and data centers.

MLFlow Model Registry (community)::
A centralized model store, set of APIs, and UI for collaboratively managing the full lifecycle of an MLflow model. It provides model lineage (which MLflow experiment and run produced the model), model versioning, model aliasing, model tagging, and annotations.

Other::
This solution also uses a variety of _observability tools_, including Prometheus monitoring and Grafana dashboards that are integrated with OpenShift, as well as components of the Observatorium meta-project, which includes Thanos and the Loki API.
Lines changed: 194 additions & 0 deletions
@@ -0,0 +1,194 @@
:_content-type: PROCEDURE
:imagesdir: ../../../images

[id="deploying-mfd-pattern"]
= Deploying the MLOps Fraud Detection pattern

== Prerequisites

. An OpenShift cluster (go to https://console.redhat.com/openshift/create[the OpenShift console]). The cluster must have a dynamic StorageClass to provision PersistentVolumes.
// See also link:../../mlops-fraud-detection/cluster-sizing[sizing your cluster].
. A GitHub account (and a token for it with repository permissions, to read from and write to your forks)

For installation tooling dependencies, see link:https://validatedpatterns.io/learn/quickstart/[Patterns quick start].

The use of this pattern depends on having a Red Hat OpenShift cluster. In this version of the validated pattern
there is no dedicated Hub / Edge cluster for the *MLOps Fraud Detection* pattern. This single-node pattern can be extended with managed clusters attached to a central hub.
// See link:../../mlops-fraud-detection/ideas-for-customization[ideas for customization.]

If you do not have a running Red Hat OpenShift cluster you can start one on a
public or private cloud by using link:https://console.redhat.com/openshift/create[Red Hat's cloud service].

[id="utilities"]
= Utilities

A number of utilities have been built by the validated patterns team to lower the barrier to entry for using the community or Red Hat Validated Patterns. To use these utilities, you need to export some environment variables for your cloud provider.

[id="preparation"]
= Preparation

. Fork the link:https://github.com/validatedpatterns/mlops-fraud-detection[mlops-fraud-detection] repository on GitHub. It is necessary to fork because your fork will be updated as part of the GitOps and DevOps processes.
. Clone the forked copy of this repository.
+
[,sh]
----
git clone git@github.com:<your-username>/mlops-fraud-detection.git
----

. Create a local copy of the Helm secrets values file that can safely include credentials.
+
*DO NOT COMMIT THIS FILE*
+
You do not want to push credentials to GitHub.
+
[,sh]
----
cp values-secret-mlops-fraud-detection.yaml.template ~/values-secret.yaml
vi ~/values-secret.yaml
----

*values-secret.yaml example*

[source,yaml]
----
secrets:
  # Nothing at the time of writing.
----

When you edit the file, you can change the various DB and Grafana passwords if you wish.

. Customize the `values-global.yaml` for your deployment
+
[,sh]
----
git checkout -b my-branch
vi values-global.yaml
----

*Replace instances of PROVIDE_ with your specific configuration*

[source,yaml]
----
global:
  pattern: mlops-fraud-detection
  hubClusterDomain: "AUTO" # For testing only; this value is fetched automatically when invoking against a cluster

  options:
    useCSV: false
    syncPolicy: Automatic
    installPlanApproval: Automatic

main:
  clusterGroupName: hub
  gitOpsSpec:
    operatorChannel: gitops-1.9
----

[,sh]
----
git add values-global.yaml
git commit values-global.yaml
git push origin my-branch
----

. You can deploy the pattern using the link:/infrastructure/using-validated-pattern-operator/[validated pattern operator]. If you use the operator, skip to Validating the Environment below.
. Preview the changes that will be made to the Helm charts.
+
[,sh]
----
./pattern.sh make show
----

. Log in to your cluster by using `oc login` or by exporting `KUBECONFIG`.
+
[,sh]
----
oc login
----
+
Or set `KUBECONFIG` to the path of your `kubeconfig` file. For example:
+
[,sh]
----
export KUBECONFIG=~/my-ocp-env/auth/kubeconfig
----

[id="check-the-values-files-before-deployment-getting-started"]
== Check the values files before deployment

You can run a check before deployment to make sure that you have the required variables to deploy the
MLOps Fraud Detection Validated Pattern.

You can run `make predeploy` to check your values. This allows you to review your values and change them in
case there are typos or old values. The values files that should be reviewed prior to deploying the
MLOps Fraud Detection Validated Pattern are:

|===
| Values File | Description

| values-secret.yaml / values-secret-mlops-fraud-detection.yaml
| The values file that includes the rhpam and fhir-psql-db sections with all database and related secrets

| values-global.yaml
| File that contains all the global values used by Helm
|===

= Deploy

. Apply the changes to your cluster.
+
[,sh]
----
./pattern.sh make install
----
+
If the install fails, go back over the instructions to see what was missed, change it, and then run `make update` to continue the installation.

. This takes some time, especially for the OpenShift Data Foundation operator components to install and synchronize. The `make install` command provides some progress updates during the install and can take up to twenty minutes. Compare your `make install` run progress with the following video showing a successful install.

. Check that the operators have been installed in the UI. The check can also be scripted; see the sketch after this list.
.. To verify, in the OpenShift Container Platform web console, navigate to the *Operators → Installed Operators* page.
.. Check that the operator is installed in the `openshift-operators` namespace and its status is `Succeeded`.
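
If you prefer the command line, a sketch of the same check follows, assuming an authenticated `oc` session; it lists each ClusterServiceVersion and its install phase:

[source,python]
----
# Sketch: list installed operators and their install phase via `oc`.
# Assumes an authenticated `oc` session against the cluster.
import json
import subprocess

out = subprocess.run(
    ["oc", "get", "csv", "-n", "openshift-operators", "-o", "json"],
    capture_output=True, text=True, check=True,
).stdout

for item in json.loads(out)["items"]:
    name = item["metadata"]["name"]
    phase = item.get("status", {}).get("phase", "Unknown")
    print(f"{name}: {phase}")  # healthy operators report 'Succeeded'
----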

[id="using-openshift-gitops-to-check-on-application-progress-getting-started"]
== Using OpenShift GitOps to check on Application progress

You can also check on the progress using OpenShift GitOps to check on the various applications deployed.

. Obtain the ArgoCD URLs and passwords.
+
The URLs and login credentials for ArgoCD change depending on the pattern
name and the site names they control. Follow the instructions below to find
them, however you choose to deploy the pattern.
+
Display the fully qualified domain names, and matching login credentials, for
all ArgoCD instances:
+
[,sh]
----
ARGO_CMD=`oc get secrets -A -o jsonpath='{range .items[*]}{"oc get -n "}{.metadata.namespace}{" routes; oc -n "}{.metadata.namespace}{" extract secrets/"}{.metadata.name}{" --to=-\\n"}{end}' | grep gitops-cluster`
CMD=`echo $ARGO_CMD | sed 's|- oc|-;oc|g'`
eval $CMD
----
+
The result should look something like:
+
[,text]
----
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
hub-gitops-server hub-gitops-server-mlops-fraud-detection-hub.apps.mfd-cluster.aws.validatedpatterns.com hub-gitops-server https passthrough/Redirect None
# admin.password
xsyYU6eSWtwniEk1X3jL0c2TGfQgVpDH
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
cluster cluster-openshift-gitops.apps.mfd-cluster.aws.validatedpatterns.com cluster 8080 reencrypt/Allow None
kam kam-openshift-gitops.apps.mfd-cluster.aws.validatedpatterns.com kam 8443 passthrough/None None
openshift-gitops-server openshift-gitops-server-openshift-gitops.apps.mfd-cluster.aws.validatedpatterns.com openshift-gitops-server https passthrough/Redirect None
# admin.password
FdGgWHsBYkeqOczE3PuRpU1jLn7C2fD6
----
+
The most important ArgoCD instance to examine at this point is `mlops-fraud-detection-hub`. This is where all the applications for the pattern can be tracked.

. Check that all applications are synchronized. There are thirteen different ArgoCD "applications" deployed as part of this pattern; a scripted check is sketched below.
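
A quick scripted alternative is to query the Argo CD `Application` resources directly. A sketch, assuming cluster access; the field paths follow the Argo CD CRD:

[source,python]
----
# Sketch: report sync and health status of every Argo CD application.
# Assumes an authenticated `oc` session against the cluster.
import json
import subprocess

out = subprocess.run(
    ["oc", "get", "applications.argoproj.io", "-A", "-o", "json"],
    capture_output=True, text=True, check=True,
).stdout

for app in json.loads(out)["items"]:
    name = app["metadata"]["name"]
    sync = app.get("status", {}).get("sync", {}).get("status", "Unknown")
    health = app.get("status", {}).get("health", {}).get("status", "Unknown")
    print(f"{name}: sync={sync}, health={health}")
----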