<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>AI - Architect the cloud</title>
	<atom:link href="https://blog.slepcevic.net/tag/ai/feed/" rel="self" type="application/rss+xml" />
	<link>https://blog.slepcevic.net</link>
	<description></description>
	<lastBuildDate>Mon, 06 Oct 2025 09:23:48 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9</generator>
	<item>
		<title>Your own AI coding assistant running on Akamai cloud!</title>
		<link>https://blog.slepcevic.net/your-own-ai-coding-assistant-running-on-akamai-cloud/</link>
					<comments>https://blog.slepcevic.net/your-own-ai-coding-assistant-running-on-akamai-cloud/#respond</comments>
		
		<dc:creator><![CDATA[Alesandro Slepčević]]></dc:creator>
		<pubDate>Tue, 18 Feb 2025 18:17:12 +0000</pubDate>
				<category><![CDATA[Akamai Connected Cloud]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Cloud]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Terraform]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI coding]]></category>
		<category><![CDATA[bolt.diy]]></category>
		<category><![CDATA[Linode]]></category>
		<category><![CDATA[Linux]]></category>
		<guid isPermaLink="false">https://blog.slepcevic.net/?p=760</guid>

					<description><![CDATA[<p>What? You want some AI to write my code? AI-powered coding assistants have been the main talk of the developer world for a while now, there&#8217;s no denying that. I can&#8217;t count the times I&#8217;ve read somewhere that AI will replace developers...</p>
<p>The post <a href="https://blog.slepcevic.net/your-own-ai-coding-assistant-running-on-akamai-cloud/">Your own AI coding assistant running on Akamai cloud!</a> first appeared on <a href="https://blog.slepcevic.net">Architect the cloud</a>.</p>]]></description>
										<content:encoded><![CDATA[<h3 class="wp-block-heading">What? You want some AI to write my code?</h3>



<p>AI-powered coding assistants have been the main talk of the developer world for a while now, there&#8217;s no denying that. I can&#8217;t count the times I&#8217;ve read somewhere that AI will replace developers in the next X years. You’ve probably seen tools like <strong>GitHub Copilot, ChatGPT, or Tabnine</strong> popping up everywhere.</p>



<p>They promise to boost productivity, help with debugging, and even teach you new coding techniques. Sounds amazing, right? But like anything, AI-powered coding assistants have their downsides too. So, let’s talk about what makes them great—and where they might fall short.</p>



<h3 class="wp-block-heading">Why AI Coding Assistants Are a Game-Changer</h3>



<p>Obviously, one of the biggest advantages of using an AI assistant is the time it saves. Instead of writing the same repetitive boilerplate code over and over and over again, you can generate it in seconds. Need a quick function to parse JSON? AI has you covered.  Easy peasy! Stuck on how to structure your SQL query? Just ask. This means less time spent on the boring stuff and more time on actual problem-solving.</p>



<p>AI is also a fantastic debugging tool. It can analyze your code, catch potential issues, and suggest fixes before you even run it. Instead of spending hours combing through error messages and Stack Overflow threads, you get quick, relevant suggestions that help you move forward faster.</p>



<p>And let’s not forget about learning. If you’re picking up a new language or framework, an AI assistant can guide you with real-time examples, explain unfamiliar syntax, and even generate sample projects. It’s like having a 24/7 coding mentor who doesn’t judge your questions.</p>



<p>Beyond just speed and learning, AI can actually help improve code quality. It can suggest best practices, help format your code, and even recommend refactoring when your code gets messy. Plus, if you’re working in a team, it can assist with keeping code style consistent and even generate useful commit messages or documentation. Wouldn&#8217;t it be cool if we could plug the AI into our pipeline and make sure that all rules are being followed?</p>



<h3 class="wp-block-heading">The downsides no one (everyone) talks about?</h3>



<p>As cool as AI coding assistants are, you don&#8217;t need to be a genius to see that they&#8217;re far from perfect. One of the biggest concerns I personally see is over-reliance. If you’re constantly relying on AI to write your code, do you really understand what’s happening under the hood? This can be a problem when something breaks, and you don’t know how to fix it because you never really wrote the thing in the first place <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f642.png" alt="🙂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> I&#8217;m sure you love reading someone else&#8217;s codebase and debugging that &lt;3</p>



<p>Another issue is that AI-generated code isn’t always optimized or even correct! It might suggest something that works but isn’t efficient, secure, or maintainable. If you blindly accept AI suggestions without reviewing them, you could end up with a mess of inefficient or buggy code.</p>



<p>Then there’s the question of security. AI assistants are trained on huge datasets, and sometimes they can generate code that includes security vulnerabilities. If you’re working on sensitive stuff, you have to be extra careful about what code you’re using and where it’s coming from.</p>



<p>Let&#8217;s talk privacy! Many AI coding tools rely on cloud-based processing, meaning your code might be sent to external servers for analysis. If you’re working on proprietary or confidential code, you need to be aware of the risks and check the privacy policies of the tools you’re using.</p>



<p>And finally, while AI can make you more productive, it can also be a bit of a crutch. Some developers might start relying too much on AI for even basic things, which can slow down their growth and problem-solving skills in the long run. </p>



<h3 class="wp-block-heading">So, Should You Use One?</h3>



<p>AI coding assistants are undeniably powerful tools, but they work best when used wisely. They’re great for boosting productivity, helping with debugging, and learning new technologies—but they shouldn’t replace actual coding knowledge and problem-solving skills. Think of them as a really smart assistant, not a replacement for your own expertise.</p>



<p>If you use AI responsibly (review its suggestions, stay mindful of security risks, and make sure you’re still learning and improving as a developer), it can be a fantastic addition to your workflow. Just don’t let it do all the thinking for you <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f642.png" alt="🙂" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>



<h3 class="wp-block-heading">Still interested and want to start using AI in your daily work? Enter bolt.diy <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f642.png" alt="🙂" class="wp-smiley" style="height: 1em; max-height: 1em;" /></h3>



<p><strong>bolt.diy</strong> is the open source version of Bolt.new (previously known as oTToDev and bolt.new ANY LLM), which allows you to choose the LLM that you use for each prompt! Currently, you can use OpenAI, Anthropic, Ollama, OpenRouter, Gemini, LMStudio, Mistral, xAI, HuggingFace, DeepSeek, or Groq models.</p>



<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper">
<div class="responsive-embed widescreen"><iframe title="I Forked Bolt.new and Made it WAY Better" width="1000" height="563" src="https://www.youtube.com/embed/3PFcAu_oU80?list=PLyrg3m7Ei-MpOPKdenkQNcx8ueI36RNrA" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe></div>
</div></figure>



<p><strong>bolt.diy</strong> was originally started by&nbsp;<a href="https://www.youtube.com/@ColeMedin">Cole Medin</a>&nbsp;but has quickly grown into a massive community effort to build one of the best open source AI coding assistants out there.</p>



<h2 class="wp-block-heading">What do I need to get this deployed?</h2>



<p>Well, just Terraform and a Linode account. <br>In the backend we will deploy a VM with a GPU attached, install bolt.diy, ollama and ask it to write some code! Maybe a simple Tic-Tac-Toe game? <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f642.png" alt="🙂" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>



<p>Ideally you would run your bolt.diy deployment on a separate machine from the machine running the model, but for our use case, the current deployment model is more than enough.</p>



<p>Like most of the things on this blog, guess what we&#8217;re gonna use? Yes! IaC!!! <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f600.png" alt="😀" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>



<p>Here&#8217;s a link to the <a href="https://github.com/aslepcev/linode-bolt.diy" target="_blank" rel="noopener" title="">GitHub repository</a> containing the Terraform code. </p>



<p>The code will do the following:</p>



<ol class="wp-block-list">
<li>Deploy a GPU based instance in Akamai Connected Cloud</li>



<li>Use cloud-init to install the following:
<ul class="wp-block-list">
<li>curl</li>



<li>wget</li>



<li>nodejs</li>



<li>npm</li>



<li>nvtop &#8211; great tool to monitor your GPU usage</li>



<li>Nvidia drivers</li>
</ul>
</li>



<li>Deploy and configure a firewall which will allow SSH and <strong>bolt.diy</strong> access from your IP. </li>



<li>Configure <strong>bolt </strong>and <strong>ollama </strong>to run as Linux services. For the <strong>ollama </strong>service, we always make sure a model is downloaded and created with a 32K context size. </li>
</ol>
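
<p>To make the 32K context size a bit more tangible, here&#8217;s a minimal Python sketch of how you could ask Ollama for that context window on a single request through its REST API. This is not what the Terraform code does (it bakes the context size into the model itself). The default Ollama port 11434, the model tag and the helper names are my assumptions, so adjust them to whatever is actually running on your VM.</p>

```python
import json
import urllib.request

# Assumptions: Ollama's default local port and a hypothetical model tag.
OLLAMA_URL = "http://127.0.0.1:11434/api/generate"
MODEL = "qwen2.5-coder:14b"   # hypothetical; use whichever model you pulled
CONTEXT_TOKENS = 32 * 1024    # the 32K context size mentioned above


def build_generate_request(prompt: str) -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request."""
    payload = {
        "model": MODEL,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": CONTEXT_TOKENS},
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_ollama(prompt: str) -> str:
    """Send the request and return the generated text."""
    with urllib.request.urlopen(build_generate_request(prompt), timeout=300) as response:
        return json.loads(response.read())["response"]
```

<p>Since the firewall only opens SSH and the bolt.diy port, you would run something like <code>ask_ollama("Write a Tic-Tac-Toe game in NodeJS")</code> from the VM itself.</p>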



<h2 class="wp-block-heading">How do you deploy it? </h2>



<p>Just fill in your Linode API token, the desired region and your IP address in the <strong>variables.tf </strong>file and run the following commands:</p>



<pre class="wp-block-code"><code>git clone https://github.com/aslepcev/linode-bolt.diy
cd linode-bolt.diy
#Fill in the variables.tf file now
terraform init
terraform plan
terraform apply</code></pre>



<p>After a short 5-6 minute wait, everything should be deployed and ready to use. Go ahead and visit the IP address of your VM on port 5173. </p>



<p><strong>Example url</strong>: http://172.233.246.209:5173</p>



<p>Make sure that Ollama is selected as a provider and you&#8217;re off to the races!</p>



<h2 class="wp-block-heading">What can it do? </h2>



<p>Well, it really depends on the model we are running. With the <strong><a href="https://www.linode.com/pricing/#compute-gpu" target="_blank" rel="noopener" title="">RTX 4000 Ada GPU</a></strong>, we can comfortably run a <strong>14B parameter model with 32K context size</strong> which is &#8220;ok&#8221; for smaller and simpler stuff. </p>



<p>I tested it out with a simple task of creating a Tic-Tac-Toe game in NodeJS. It got the functionality right the first time, but it looked like something only a mother could love <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f642.png" alt="🙂" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>



<figure class="wp-block-image size-large"><img fetchpriority="high" decoding="async" width="1024" height="505" src="https://blog.slepcevic.net/wp-content/uploads/2025/02/bolt1-1024x505.png" alt="" class="wp-image-770" srcset="https://blog.slepcevic.net/wp-content/uploads/2025/02/bolt1-1024x505.png 1024w, https://blog.slepcevic.net/wp-content/uploads/2025/02/bolt1-300x148.png 300w, https://blog.slepcevic.net/wp-content/uploads/2025/02/bolt1-768x379.png 768w, https://blog.slepcevic.net/wp-content/uploads/2025/02/bolt1-1536x758.png 1536w, https://blog.slepcevic.net/wp-content/uploads/2025/02/bolt1.png 1615w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>I just told it to make it a bit prettier and add some color; these were the results I got:</p>



<figure class="wp-block-image size-full"><img decoding="async" width="694" height="589" src="https://blog.slepcevic.net/wp-content/uploads/2025/02/bolt-3.png" alt="" class="wp-image-771" srcset="https://blog.slepcevic.net/wp-content/uploads/2025/02/bolt-3.png 694w, https://blog.slepcevic.net/wp-content/uploads/2025/02/bolt-3-300x255.png 300w" sizes="(max-width: 694px) 100vw, 694px" /></figure>



<p>Interestingly, during the coding process, it made a mistake which it managed to identify and fix all on its own! All I did was press the &#8220;<strong>Ask Bolt</strong>&#8221; button. </p>






<figure class="wp-block-image size-full is-resized"><img loading="lazy" decoding="async" width="616" height="251" src="https://blog.slepcevic.net/wp-content/uploads/2025/02/bolt-fix.png" alt="" class="wp-image-772" style="width:838px;height:auto" srcset="https://blog.slepcevic.net/wp-content/uploads/2025/02/bolt-fix.png 616w, https://blog.slepcevic.net/wp-content/uploads/2025/02/bolt-fix-300x122.png 300w" sizes="auto, (max-width: 616px) 100vw, 616px" /></figure>



<p>Also, here&#8217;s a fully functioning Space Invaders-like game, which it also wrote:</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="975" height="665" src="https://blog.slepcevic.net/wp-content/uploads/2025/02/bolt-space.png" alt="" class="wp-image-774" srcset="https://blog.slepcevic.net/wp-content/uploads/2025/02/bolt-space.png 975w, https://blog.slepcevic.net/wp-content/uploads/2025/02/bolt-space-300x205.png 300w, https://blog.slepcevic.net/wp-content/uploads/2025/02/bolt-space-768x524.png 768w" sizes="auto, (max-width: 975px) 100vw, 975px" /></figure>



<h2 class="wp-block-heading">What if I want to run a larger model? 32B parameters or even larger? </h2>



<p>That&#8217;s very easy! Since Ollama can use multiple GPUs, all we need to do is scale up the VM to a plan which includes two or more GPUs. Akamai offers a maximum of 4 GPUs per VM, which gives us up to 80 GB of VRAM to run our model. I will not experiment with larger models in this blog post; this is something we will benchmark and try out in the future. </p>



<p>Cheers! Alex.</p>



<p>P.S &#8211; parts of this post were written by bolt.diy <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f642.png" alt="🙂" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>



<p></p><p>The post <a href="https://blog.slepcevic.net/your-own-ai-coding-assistant-running-on-akamai-cloud/">Your own AI coding assistant running on Akamai cloud!</a> first appeared on <a href="https://blog.slepcevic.net">Architect the cloud</a>.</p>]]></content:encoded>
					
					<wfw:commentRss>https://blog.slepcevic.net/your-own-ai-coding-assistant-running-on-akamai-cloud/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Sentiment analysis of 40 thousand movie reviews in 20 minutes using Neural Magic&#8217;s DeepSparse inference runtime and Linode virtual machines.</title>
		<link>https://blog.slepcevic.net/sentiment-analysis-of-40-thousand-movie-reviews-in-20-minutes-using-neural-magics-deepsparse-inference-runtime-and-linode/</link>
					<comments>https://blog.slepcevic.net/sentiment-analysis-of-40-thousand-movie-reviews-in-20-minutes-using-neural-magics-deepsparse-inference-runtime-and-linode/#respond</comments>
		
		<dc:creator><![CDATA[Alesandro Slepčević]]></dc:creator>
		<pubDate>Sat, 30 Mar 2024 22:54:00 +0000</pubDate>
				<category><![CDATA[Akamai Connected Cloud]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Cloud]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Terraform]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[DeepSparse]]></category>
		<category><![CDATA[Linode]]></category>
		<category><![CDATA[Neural Magic]]></category>
		<guid isPermaLink="false">https://blog.slepcevic.net/?p=287</guid>

					<description><![CDATA[<p>First, let me start with a word or two about DeepSparse. DeepSparse is a sparsity-aware inference runtime that delivers GPU-class performance on commodity CPUs, purely in software, anywhere. GPUs Are Not Optimal &#8211; Machine learning inference has evolved over the...</p>
<p>The post <a href="https://blog.slepcevic.net/sentiment-analysis-of-40-thousand-movie-reviews-in-20-minutes-using-neural-magics-deepsparse-inference-runtime-and-linode/">Sentiment analysis of 40 thousand movie reviews in 20 minutes using Neural Magic’s DeepSparse inference runtime and Linode virtual machines.</a> first appeared on <a href="https://blog.slepcevic.net">Architect the cloud</a>.</p>]]></description>
										<content:encoded><![CDATA[<p>First, let me start with a word or two about DeepSparse. </p>



<p><strong>DeepSparse is a sparsity-aware inference runtime that delivers GPU-class performance on commodity CPUs, purely in software, anywhere.</strong></p>



<p><strong>GPUs Are Not Optimal</strong> &#8211; Machine learning inference has evolved over the years led by GPU advancements. GPUs are fast and powerful, but they can be expensive, have shorter life spans, and require a lot of electricity and cooling.</p>



<p>Other major problems with GPUs, especially in the context of <a href="https://www.akamai.com/newsroom/press-release/akamai-takes-cloud-computing-to-the-edge" title="">Edge computing</a>, are that they can&#8217;t be packed as densely and are less power-efficient than CPUs, not to mention their limited availability these days.</p>



<p>Since Akamai recently partnered up with Neural Magic, I&#8217;ve decided to write a quick tutorial on how to easily get started with running a simple <strong>DeepSparse sentiment analysis workload</strong>. </p>



<p>In case you want to learn more about Akamai and Neural Magic&#8217;s partnership, make sure to watch this excellent video from TFiR. It will also give you a great summary of Akamai&#8217;s Project Gecko.</p>



<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper">
<div class="responsive-embed widescreen"><iframe loading="lazy" title="Akamai partners with Neural Magic to bring AI to edge use cases" width="1000" height="563" src="https://www.youtube.com/embed/MG45UM4SlbQ?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe></div>
</div></figure>






<h3 class="wp-block-heading">What is Sentiment analysis?</h3>



<p><strong>Sentiment analysis</strong>&nbsp;(also known as&nbsp;<strong>opinion mining</strong>&nbsp;or&nbsp;<strong>emotion AI</strong>) is the use of&nbsp;<a href="https://en.wikipedia.org/wiki/Natural_language_processing">natural language processing</a>,&nbsp;<a href="https://en.wikipedia.org/wiki/Text_analytics">text analysis</a>,&nbsp;<a href="https://en.wikipedia.org/wiki/Computational_linguistics">computational linguistics</a>, and&nbsp;<a href="https://en.wikipedia.org/wiki/Biometrics">biometrics</a>&nbsp;to systematically identify, extract, quantify, and study affective states and subjective information. Sentiment analysis is widely applied to&nbsp;<a href="https://en.wikipedia.org/wiki/Voice_of_the_customer">voice of the customer</a>&nbsp;materials such as reviews and survey responses, online and social media, and healthcare materials for applications that range from&nbsp;<a href="https://en.wikipedia.org/wiki/Marketing">marketing</a>&nbsp;to&nbsp;<a href="https://en.wikipedia.org/wiki/Customer_relationship_management">customer service</a>&nbsp;to clinical medicine.&nbsp;</p>



<p>Why is DeepSparse cool? Because I&#8217;m doing analysis of 40 thousand movie reviews in 20 minutes using only <strong>TWO DUAL CORE Linode VMs. Mind officially blown. </strong></p>






<p>Let&#8217;s do some math here: 40 thousand reviews in 20 minutes works out to 120 thousand processed reviews an hour. With 2 instances and a load balancer, we can process over <strong>86 million requests a month</strong> (120,000 × 24 hours × 30 days), which will cost you a <strong>staggering $82 <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f600.png" alt="😀" class="wp-smiley" style="height: 1em; max-height: 1em;" /></strong>. </p>



<p><strong>If you&#8217;re doing that on other cloud providers, you&#8217;re paying a five digit monthly bill for that pleasure. </strong></p>
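
<p>If you want to double-check the numbers above, here&#8217;s a small Python sketch of the same arithmetic. The only inputs are the figures from this post; the 30-day month and the variable names are my assumptions.</p>

```python
REVIEWS = 40_000        # reviews processed in one benchmark run
RUN_MINUTES = 20        # duration of that run
MONTHLY_COST_USD = 82   # monthly price quoted above for 2 VMs + load balancer

reviews_per_hour = REVIEWS * 60 // RUN_MINUTES
reviews_per_month = reviews_per_hour * 24 * 30   # assuming a 30-day month
cost_per_million = MONTHLY_COST_USD / (reviews_per_month / 1_000_000)

print(f"{reviews_per_hour:,} reviews per hour")        # 120,000 reviews per hour
print(f"{reviews_per_month:,} reviews per month")      # 86,400,000 reviews per month
print(f"${cost_per_million:.2f} per million reviews")  # $0.95 per million reviews
```

<p>In other words, at a sustained full load the setup would come in at under a dollar per million analyzed reviews.</p>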






<h2 class="wp-block-heading">Want to try it yourself? It&#8217;s easy!</h2>



<p>If you want to try it out on Linode, follow the instructions below. </p>



<p>If you want to check out the Neural Magic DeepSparse repo, head over <a href="https://github.com/neuralmagic/deepsparse" target="_blank" rel="noopener" title="">here</a>.</p>



<p><strong>Step 1.  Clone the Repository</strong>.</p>



<p>Open your terminal or command prompt and run the following command: </p>



<pre class="wp-block-code"><code>git clone https://github.com/slepix/neuralmagic-linode</code></pre>



<p>This code will deploy <strong>2 x Dedicated 4 GB</strong> virtual machines and a <strong>NodeBalancer</strong>. It will also install Neural Magic&#8217;s DeepSparse runtime as a Linux service, and install &amp; configure Nginx to proxy requests to the DeepSparse server listening on 127.0.0.1:5543. </p>



<p class="has-vivid-red-color has-text-color has-link-color wp-elements-f0a0be9819cd2bc070b1912e9e812e46"><strong>WARNING: THIS IS NOT PRODUCTION GRADE SERVER CONFIGURATION!</strong></p>



<p class="has-vivid-red-color has-text-color has-link-color wp-elements-42cceec5227b00bdbbe795cf30c2587a">It&#8217;s just a POC! Secure your servers and consult Neural Magic documentation if you want to go to production. </p>



<p><strong>Step 2. </strong>&#8211; <strong>Navigate to the repository</strong></p>



<p>Navigate to the repo using the following command: </p>



<pre class="wp-block-code"><code>cd neuralmagic-linode</code></pre>



<p>If you haven&#8217;t already installed Terraform on your machine, you can download it from the <a href="https://developer.hashicorp.com/terraform/install?product_intent=terraform" target="_blank" rel="noopener" title="">official Terraform website</a> and follow the installation instructions for your operating system.</p>



<p><strong>Step 3. </strong>&#8211; <strong>Terraform init</strong></p>



<p>Initialize Terraform by running:</p>



<pre class="wp-block-code"><code>terraform init</code></pre>



<p><strong>Step 4. </strong>&#8211; <strong>Configure your Linode token</strong></p>



<p>Open the <strong>variables.tf </strong>file and paste in your Linode token. If you don&#8217;t know how to create a Linode PAT, check this article <strong><a href="https://www.linode.com/docs/products/tools/api/guides/manage-api-tokens/" target="_blank" rel="noopener" title="">here</a></strong>. It should look similar to the picture below. You can also adjust the region while you&#8217;re here <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f600.png" alt="😀" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>



<pre class="wp-block-preformatted">Token in the picture is not valid. It's just an example. </pre>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="772" height="275" src="https://blog.slepcevic.net/wp-content/uploads/2024/03/tokendemo.png" alt="" class="wp-image-293" srcset="https://blog.slepcevic.net/wp-content/uploads/2024/03/tokendemo.png 772w, https://blog.slepcevic.net/wp-content/uploads/2024/03/tokendemo-300x107.png 300w, https://blog.slepcevic.net/wp-content/uploads/2024/03/tokendemo-768x274.png 768w" sizes="auto, (max-width: 772px) 100vw, 772px" /></figure>



<p><strong>Step 5</strong> &#8211; <strong>Run Terraform apply</strong></p>



<p>After configuring your variables, you can apply the Terraform configuration by running:</p>



<pre class="wp-block-code"><code>terraform apply</code></pre>



<p>Terraform will show you a plan of the changes it intends to make. </p>



<p>Review the plan carefully, and if everything looks good, type <code><strong>yes</strong></code> and press Enter to apply the changes. Give it 5-6 minutes to finish everything; when you then visit your NodeBalancer IP, you should be presented with the landing page of the DeepSparse server API. </p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="518" src="https://blog.slepcevic.net/wp-content/uploads/2024/03/Screenshot-2024-03-31-233941-1024x518.png" alt="" class="wp-image-294" srcset="https://blog.slepcevic.net/wp-content/uploads/2024/03/Screenshot-2024-03-31-233941-1024x518.png 1024w, https://blog.slepcevic.net/wp-content/uploads/2024/03/Screenshot-2024-03-31-233941-300x152.png 300w, https://blog.slepcevic.net/wp-content/uploads/2024/03/Screenshot-2024-03-31-233941-768x388.png 768w, https://blog.slepcevic.net/wp-content/uploads/2024/03/Screenshot-2024-03-31-233941.png 1457w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p><strong>Step 6. </strong></p>



<p>After the installation is done, it&#8217;s finally time to send some data to our API and see how it performs. </p>



<p>We can do that by using <strong>curl</strong>, or <strong>Invoke-WebRequest</strong> if you&#8217;re on Windows and using PowerShell. </p>



<p><strong>CURL: </strong></p>



<pre class="wp-block-code"><code>sentence="Neural Magic &amp; Akamai are cool!"
nodebalancer="172.233.34.110" #PUT YOUR NODEBALANCER IP HERE
curl -X POST http://$nodebalancer/v2/models/sentiment_analysis/infer -H "Content-Type: application/json" -d "{\"sequences\": \"$sentence\"}"</code></pre>



<p><strong>PowerShell:</strong></p>






<pre class="wp-block-code"><code>$sentence = "Neural Magic &amp; Akamai are cool!"
$nodebalancer = "172.233.34.110"

$path = "v2/models/sentiment_analysis/infer"
$api = "http://$nodebalancer/$path"
$body = @{
   sequences = $sentence
} | ConvertTo-Json

(Invoke-WebRequest -Uri $api -Method Post -ContentType "application/json" -Body $body -ErrorAction Stop).content</code></pre>



<p>In both cases, make sure to paste in the <strong>IP address of the NodeBalancer</strong> you deployed and modify the sentence as you wish. </p>
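
<p>If you prefer Python, the same request can be sketched with nothing but the standard library. The endpoint path and the <code>sequences</code> field are the ones used in the curl and PowerShell examples; the function names are made up for illustration, and I&#8217;m assuming the server answers with JSON.</p>

```python
import json
import urllib.request

NODEBALANCER = "172.233.34.110"  # put your NodeBalancer IP here
PATH = "v2/models/sentiment_analysis/infer"


def build_request(sentence: str, host: str = NODEBALANCER) -> urllib.request.Request:
    """Build the POST request; json.dumps takes care of escaping quotes."""
    body = json.dumps({"sequences": sentence}).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}/{PATH}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(sentence: str, host: str = NODEBALANCER) -> dict:
    """Send one sentence to the API and return the decoded response."""
    with urllib.request.urlopen(build_request(sentence, host), timeout=30) as response:
        return json.loads(response.read())  # assumption: the body is JSON
```

<p>Building the body with <code>json.dumps</code> means sentences containing quotes or backslashes are escaped properly, which the hand-built JSON string in the curl one-liner does not do.</p>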



<h2 class="wp-block-heading">Benchmark time!</h2>



<p>In the repository, I&#8217;ve included an archive called movies.zip (containing movies.csv) and three benchmark scripts: two PowerShell files and one Python file.</p>



<p><strong>movies.zip</strong> &#8211; unzip this one in the same folder where your benchmark scripts are. </p>



<p><strong>analyze.ps1</strong> &#8211; PowerShell based benchmark, sends requests serially &#8211; the slowest option. </p>



<p><strong>panalyze.ps1</strong> &#8211; PowerShell based benchmark, sends requests in parallel &#8211; performs better.</p>



<p><strong>pypanalyze.py</strong> &#8211; Python based benchmark, sends requests in parallel &#8211; <strong>best performer (doh!) &lt;-use this</strong></p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="581" height="153" src="http://blog.slepcevic.net/wp-content/uploads/2024/04/Screenshot-2024-04-01-014021.png" alt="" class="wp-image-327" srcset="https://blog.slepcevic.net/wp-content/uploads/2024/04/Screenshot-2024-04-01-014021.png 581w, https://blog.slepcevic.net/wp-content/uploads/2024/04/Screenshot-2024-04-01-014021-300x79.png 300w" sizes="auto, (max-width: 581px) 100vw, 581px" /></figure>



<p>All you need to do to kick off a benchmark is update the URL variable with your NodeBalancer IP and you&#8217;re off to the races. </p>
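
<p>To give you an idea of what &#8220;sends requests in parallel&#8221; means in practice, here&#8217;s a minimal sketch using a thread pool. It is not the script from the repository, just the general pattern; <code>run_benchmark</code> and <code>fake_send</code> are hypothetical names, and the fake sender stands in for a real HTTP call so the example runs offline.</p>

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def run_benchmark(
    reviews: Iterable[str],
    send: Callable[[str], object],
    workers: int = 8,
) -> tuple[list, float]:
    """Send every review through `send` using a pool of worker threads.

    Returns the results in input order and the elapsed wall-clock seconds.
    """
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(send, reviews))
    elapsed = time.perf_counter() - started
    return results, elapsed


def fake_send(review: str) -> str:
    # Stand-in for a real HTTP call so the sketch runs without a server.
    time.sleep(0.01)
    return "positive" if "good" in review else "negative"


sample = ["good movie", "terrible plot", "really good cast"] * 10
labels, seconds = run_benchmark(sample, fake_send, workers=10)
print(len(labels), "reviews in", round(seconds, 2), "seconds")
```

<p>Threads are a reasonable fit here because each worker spends most of its time waiting on the network, so the <code>workers</code> value mostly controls how many requests are in flight at once.</p>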



<h2 class="wp-block-heading">Does it scale?</h2>



<p><strong>Yes!</strong> For kicks I&#8217;ve added a third node and the same job finished in 825 seconds. Feel free to add as many nodes as you like and see what numbers you can get. Additionally, you can play with the number of workers in the Python file. </p>
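
<p>A quick way to judge that result is to compare the measured speedup with the ideal one. The sketch below assumes the two-node run took the roughly 20 minutes (1200 seconds) from the title, so treat the output as a ballpark figure rather than a precise measurement.</p>

```python
BASELINE_NODES, BASELINE_SECONDS = 2, 20 * 60   # "20 minutes" from the title, approximate
SCALED_NODES, SCALED_SECONDS = 3, 825           # measured run with a third node

speedup = BASELINE_SECONDS / SCALED_SECONDS
ideal_speedup = SCALED_NODES / BASELINE_NODES
efficiency = speedup / ideal_speedup

print(f"speedup: {speedup:.2f}x (ideal {ideal_speedup:.2f}x)")  # speedup: 1.45x (ideal 1.50x)
print(f"scaling efficiency: {efficiency:.0%}")                  # scaling efficiency: 97%
```

<p>Under that assumption, the third node delivers close to linear scaling.</p>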



<pre class="wp-block-preformatted">Note 1: The Python script was written with the help of ChatGPT :) Its results matched my PowerShell version on a smaller, verified sample (see note 2), so I'm gonna call it good :)

Note 2: The PowerShell versions don't handle some comments as they should and end up sending garbage to the API. This happens in 3% of the cases. Most probably some encoding/character issue which I couldn't be bothered to fix :)

Note 3: The movies.csv file has been generated by using data from https://kaggle.com/

</pre>



<p>Cheers, </p>



<p>Alex. </p>



<p></p><p>The post <a href="https://blog.slepcevic.net/sentiment-analysis-of-40-thousand-movie-reviews-in-20-minutes-using-neural-magics-deepsparse-inference-runtime-and-linode/">Sentiment analysis of 40 thousand movie reviews in 20 minutes using Neural Magic’s DeepSparse inference runtime and Linode virtual machines.</a> first appeared on <a href="https://blog.slepcevic.net">Architect the cloud</a>.</p>]]></content:encoded>
					
					<wfw:commentRss>https://blog.slepcevic.net/sentiment-analysis-of-40-thousand-movie-reviews-in-20-minutes-using-neural-magics-deepsparse-inference-runtime-and-linode/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>

<!--
Performance optimized by W3 Total Cache. Learn more: https://www.boldgrid.com/w3-total-cache/?utm_source=w3tc&utm_medium=footer_comment&utm_campaign=free_plugin

Page Caching using Disk: Enhanced 
Lazy Loading (feed)

Served from: blog.slepcevic.net @ 2025-12-26 06:52:58 by W3 Total Cache
-->