
Generative AI: Answering Your Frequently Asked Questions

Prag Jaodekar

Director, Synechron, Artificial Intelligence

Generative Artificial Intelligence (GenAI) holds enormous transformational potential for businesses. Its likely benefits include improved productivity, enhanced customer service, and the rapid development of innovations that increase a firm's competitive advantage. At the heart of GenAI are Large Language Models (LLMs): deep learning models that can perform a variety of natural language processing (NLP) tasks and power chatbot applications such as Google's Bard and OpenAI's ChatGPT.

But business leaders need to carefully balance GenAI innovation with safety. The operational use of GenAI raises a number of questions related to security, compliance and ethics, and these should be significant considerations for any business shaping its GenAI strategy and framework.

We’ve provided answers to some of the most commonly asked GenAI questions:

Question #1: How do we choose the right LLM for our GenAI needs?

Answer #1: Key considerations for selecting a foundational LLM should include the following (a simple comparison sketch follows the list):

  • Type of Task – You should choose your model based on the type of task: text-to-text, text-to-speech, etc.
  • Cost – LLM costs often align with the model size. Once your Minimum Viable Product is validated and specific requirements are understood, it can be cost-effective to transition to a smaller model.
  • Training – Choosing between pre-trained and instruct-trained models depends on the nature of the task and level of control/freedom you wish your model to have.
  • Performance – If your task is highly specific or niche, a smaller, specialized and fine-tuned model may be the best choice, e.g., migrating a proprietary language to something open source, like Java, or analyzing production logs specific to a particular organization (to predict issue resolution).
  • Responsible AI – Ensure the model you choose addresses accuracy, fairness, intellectual property, toxicity and privacy, supported by end-user education on filtering and by technical safeguards such as watermarking and differential privacy.
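
To make these trade-offs concrete, a simple weighted scorecard can help compare candidate models side by side. The sketch below is illustrative only: the criteria weights, candidate names and scores are assumptions to be replaced with your own evaluation data.

```python
# A minimal sketch of a weighted scorecard for comparing candidate LLMs.
# The weights and scores below are illustrative placeholders; calibrate
# them against your own task, budget and compliance requirements.

CRITERIA_WEIGHTS = {
    "task_fit": 0.30,       # suitability for the task type (text-to-text, etc.)
    "cost": 0.20,           # expected run cost at your query volume
    "performance": 0.25,    # evaluation results on your domain
    "responsible_ai": 0.25, # fairness, toxicity and privacy safeguards
}

# Hypothetical candidates, scored 0-10 per criterion.
candidates = {
    "large_general_model": {"task_fit": 8, "cost": 3, "performance": 8, "responsible_ai": 7},
    "small_finetuned_model": {"task_fit": 9, "cost": 8, "performance": 7, "responsible_ai": 7},
}

def weighted_score(scores: dict) -> float:
    """Combine per-criterion scores into a single comparable number."""
    return sum(CRITERIA_WEIGHTS[c] * s for c, s in scores.items())

for name, scores in candidates.items():
    print(f"{name}: {weighted_score(scores):.2f}")
```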


Question #2: What are the potential enterprise deployment strategies for GenAI?

Answer #2: There are two options, each with its own advantages and disadvantages: 

  • Deploy ‘in-house’ models on your own public cloud or private infrastructure — This option offers greater control and flexibility, but comes with the associated overheads of managing the model, the infrastructure and the talent pool. It can nonetheless prove a worthwhile investment in fostering creativity and innovation, and staging your own organizational data alongside the model has the potential to unlock value and competitive advantage.
  • Use the ‘As-a-Service’ model (like ChatGPT Enterprise, available via API) — Here the setup time is minimal, you pay per query (token-based billing, where 1,000 tokens ≈ 750 words; see the cost sketch below), and infrastructure overheads are limited. However, these services operate with a limited context window (32K tokens on GPT-4) and run on the publicly available information the model was trained on, so your ability to adapt them to your own organization’s data is limited.
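
To see how token-based billing translates into budget, a back-of-the-envelope estimate like the sketch below can be useful. The per-token price and query volumes are placeholders, not current vendor pricing.

```python
# A minimal sketch of cost estimation for token-based billing, using the
# rough rule of thumb that 1,000 tokens ~ 750 words. The price used in the
# example is a placeholder, not any vendor's actual rate.

WORDS_PER_1K_TOKENS = 750

def estimate_monthly_cost(queries_per_month: int,
                          avg_words_per_query: int,
                          price_per_1k_tokens: float) -> float:
    """Estimate monthly spend for an 'As-a-Service' LLM at a given volume."""
    tokens_per_query = avg_words_per_query / WORDS_PER_1K_TOKENS * 1000
    total_tokens = tokens_per_query * queries_per_month
    return total_tokens / 1000 * price_per_1k_tokens

# Example: 50,000 queries/month, ~600 words of prompt plus response each,
# at a hypothetical $0.03 per 1,000 tokens.
print(f"${estimate_monthly_cost(50_000, 600, 0.03):,.2f} per month")
```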

In conclusion, it’s generally a good idea to use the ‘As-a-Service’ model during the PoC/PoV phase to refine the use case, then switch to a specialized ‘in-house’ model during implementation.

Question #3: What security issues do we need to understand when considering the use of GenAI in enterprise applications?

Answer #3: There are several issues to consider here:

  • For ‘in-house’ models, security should be designed following a ‘zero-trust’ model, with specific access controls in line with the enterprise’s cloud security, AI control and compliance policies.
  • For ‘As-a-Service’ models like GPT for Enterprise (GPT-4 and GPT-3.5), the security risk originates from the exchange of information across the internet and is mitigated via data encryption at rest and in transit.

Other security vulnerabilities may stem from code-generation use cases. Enterprises should ensure that machine-generated code goes through the same rigorous static and dynamic code assessments, unit and integration tests, and mandatory manual reviews as human-written code, following a human-centric approach to AI in which humans act in ‘co-pilot mode’ to ensure security and compliance at every stage (see the sketch below).
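
As a minimal illustration of such a gate, the sketch below runs an automated static check and then requires an explicit human sign-off before generated code can merge. The tool (flake8) and the function names are illustrative stand-ins for your organization’s own scanners and review workflow.

```python
# A minimal sketch of a human-in-the-loop gate for machine-generated code:
# automated static analysis runs first, then an explicit human approval is
# required before the change can merge. Tooling here is illustrative only.

import subprocess

def static_checks_pass(path: str) -> bool:
    """Run an example static analyser (here, flake8) over generated code."""
    result = subprocess.run(["flake8", path], capture_output=True, text=True)
    if result.returncode != 0:
        print("Static analysis findings:\n", result.stdout)
    return result.returncode == 0

def approve_generated_code(path: str, human_approved: bool) -> bool:
    """Merge only when both automated checks and a human reviewer agree."""
    return static_checks_pass(path) and human_approved

# Example usage: the reviewer flag would come from your code-review tooling.
if approve_generated_code("generated_module.py", human_approved=False):
    print("Safe to merge.")
else:
    print("Blocked pending checks and human review.")
```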


Question #4: Can using GenAI expose corporate confidential or Personally Identifiable Information (PII) on the internet? 

Answer #4: This concern is most relevant to ‘As-a-Service’ model use. Businesses need to ensure that enterprise solutions comply with data privacy regulations across the globe, and that enterprise offerings, such as ChatGPT Enterprise, are SOC 2 compliant.

In addition, data perturbation techniques can be employed to use LLM service offerings without compromising PII. You can learn more about these in our Synechron research paper ‘Life of PII – A PII Obfuscation Transformer’.
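
As a simple illustration of the general idea (and not the transformer-based approach described in the paper), the sketch below redacts obvious PII patterns before a prompt ever leaves the enterprise boundary. Production systems should use purpose-built PII detection rather than hand-written patterns.

```python
# A minimal sketch of redacting obvious PII before a prompt is sent to an
# external LLM service. This simple regex pass is illustrative only and is
# NOT the obfuscation transformer described in the Synechron paper.

import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace detected PII spans with typed placeholders."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

prompt = "Contact Jane at jane.doe@example.com or 555-867-5309 re: SSN 123-45-6789."
print(redact(prompt))
# -> "Contact Jane at [EMAIL] or [PHONE] re: SSN [SSN]."
```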


Question #5: Will GenAI be trained on our corporate database?

Answer #5: Most commercially available GenAI solutions, including OpenAI’s ChatGPT Enterprise, explicitly state that a client’s data and interactions with the LLM are not used to further train the model. Typically, they also offer configurations to turn off data collection at various levels. However, some degree of telemetry/usage data is collected for operational purposes (billing, administration, troubleshooting, etc.). You can learn more about this through the vendors’ ‘Trust Portals’.


Question #6: How can we ensure that data is controlled so that people only have access where applicable?

Answer #6: Businesses should utilize solutions that offer specific, role-based access control with single sign-on/multi-factor authentication, or the flexibility to design and integrate them in accordance with their own data access and governance policies.
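As a minimal sketch of this principle, the example below maps roles to the document collections an LLM retrieval layer may draw on, denying access by default. The roles and collection names are hypothetical.

```python
# A minimal sketch of role-based access control in front of an LLM
# retrieval layer: a user's role determines which document collections
# the model may draw context from. All names here are illustrative.

ROLE_COLLECTIONS = {
    "analyst": {"public_research", "market_data"},
    "hr_partner": {"public_research", "hr_policies"},
    "admin": {"public_research", "market_data", "hr_policies"},
}

def allowed_collections(role: str) -> set[str]:
    """Resolve the collections a role may query; deny by default."""
    return ROLE_COLLECTIONS.get(role, set())

def retrieve_context(role: str, requested: str) -> str:
    # Enforce the policy before any documents reach the model's context.
    if requested not in allowed_collections(role):
        raise PermissionError(f"Role '{role}' may not query '{requested}'.")
    return f"...documents from {requested}..."  # stand-in for a real retriever

print(retrieve_context("analyst", "market_data"))   # permitted
# retrieve_context("analyst", "hr_policies")        # raises PermissionError
```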


Question #7: How can we avoid vendor lock-in with our LLM implementation?  

Answer #7: As outlined above, a good strategy is to run early-stage LLM proofs of concept on ‘As-a-Service’ models, with only a modest up-front investment. Later, during the implementation phase, deployment can move entirely onto the enterprise cloud, leveraging open-source models and components rather than cloud-native proprietary ones. Open-source tech stacks for LLMs are maturing quickly and provide a wide variety of toolsets to choose from.
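
One practical way to keep that switch cheap is to hide the model behind a thin interface, so application code never depends on a specific vendor. In the sketch below, the two client classes are hypothetical stubs; real integrations would wrap the vendor SDK and an open-source serving stack respectively.

```python
# A minimal sketch of isolating the LLM behind an interface so the provider
# can be swapped (hosted API during PoC, open-source model at implementation)
# without touching application code. Both clients are illustrative stubs.

from abc import ABC, abstractmethod

class LLMClient(ABC):
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class HostedAPIClient(LLMClient):
    """Stub for an 'As-a-Service' provider used during PoC/PoV."""
    def complete(self, prompt: str) -> str:
        return f"[hosted response to: {prompt}]"

class OpenSourceClient(LLMClient):
    """Stub for a self-hosted open-source model used at implementation."""
    def complete(self, prompt: str) -> str:
        return f"[self-hosted response to: {prompt}]"

def summarise(client: LLMClient, document: str) -> str:
    # Application code depends only on the interface, not the vendor.
    return client.complete(f"Summarise: {document}")

print(summarise(HostedAPIClient(), "Q3 earnings report"))
print(summarise(OpenSourceClient(), "Q3 earnings report"))
```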

The most important thing to remember is that all LLM solutions are fueled by enterprise data. This means that a business’ GenAI solution should always be designed with the enterprise data strategy in mind.


Question #8: AI regulation is a key area of concern. How can we ensure that any AI system being designed is compliant with the latest legislation?

Answer #8: Global AI regulation is rapidly evolving, with the US Government issuing an Executive Order on AI in October 2023 and the UK recently announcing the formation of its AI Safety Institute. Organizations must now ensure that they’re designing responsible AI systems; the Monetary Authority of Singapore (MAS) outlines the principles of responsible AI system design as ‘Fairness’, ‘Ethics’, ‘Accountability’ and ‘Transparency’ (FEAT).

With an eye on the high cost of retrospective compliance, organizations must ensure that they have a clear AI policy and governance framework to provide the necessary guardrails for AI system design (without stifling the spirit of innovation). Synechron’s Regulatory Implementation Optimizer accelerator imports AI regulation from across the globe and drafts a policy framework and implementation roadmap, helping you rapidly design a dynamic AI Policy and Governance framework.


Please contact us to learn more about Synechron’s AI Ethics, Safety and Security services.
