<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>DeepSparse - Architect the cloud</title>
	<atom:link href="https://blog.slepcevic.net/tag/deepsparse/feed/" rel="self" type="application/rss+xml" />
	<link>https://blog.slepcevic.net</link>
	<description></description>
	<lastBuildDate>Wed, 12 Jun 2024 22:32:07 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9</generator>
	<item>
		<title>Deploying a GLOBAL sentiment Analysis service using DeepSparse and Akamai Connected Cloud</title>
		<link>https://blog.slepcevic.net/deploying-a-global-sentiment-analysis-service-using-deepsparse-and-akamai-connected-cloud/</link>
					<comments>https://blog.slepcevic.net/deploying-a-global-sentiment-analysis-service-using-deepsparse-and-akamai-connected-cloud/#respond</comments>
		
		<dc:creator><![CDATA[Alesandro Slepčević]]></dc:creator>
		<pubDate>Wed, 12 Jun 2024 22:31:06 +0000</pubDate>
				<category><![CDATA[Akamai Connected Cloud]]></category>
		<category><![CDATA[Akamai Gecko]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Cloud]]></category>
		<category><![CDATA[Linode Gecko]]></category>
		<category><![CDATA[Terraform]]></category>
		<category><![CDATA[Akamai CLB]]></category>
		<category><![CDATA[Akamai Cloud Load Balancer]]></category>
		<category><![CDATA[DeepSparse]]></category>
		<category><![CDATA[EdgeAI]]></category>
		<category><![CDATA[Global computing]]></category>
		<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[Linode VM]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Neural Magic]]></category>
		<guid isPermaLink="false">https://blog.slepcevic.net/?p=441</guid>

					<description><![CDATA[<p>In the previous post, we explored how to deploy a sentiment analysis application using Neural Magic’s DeepSparse on Akamai Connected Cloud (Linode). We leveraged just two dual-core VMs and a Nodebalancer to process a pretty impressive number(40K) of movie reviews...</p>
<p>The post <a href="https://blog.slepcevic.net/deploying-a-global-sentiment-analysis-service-using-deepsparse-and-akamai-connected-cloud/">Deploying a GLOBAL sentiment Analysis service using DeepSparse and Akamai Connected Cloud</a> first appeared on <a href="https://blog.slepcevic.net">Architect the cloud</a>.</p>]]></description>
										<content:encoded><![CDATA[<p>In the <a href="https://blog.slepcevic.net/sentiment-analysis-of-40-thousand-movie-reviews-in-20-minutes-using-neural-magics-deepsparse-inference-runtime-and-linode/" target="_blank" rel="noopener" title="previous post">previous post</a>, we explored how to deploy a sentiment analysis application using Neural Magic’s DeepSparse on Akamai Connected Cloud (Linode). </p>



<p>We used just two dual-core VMs and a Nodebalancer to process a pretty impressive number (40K) of movie reviews in just 20 minutes. However, deploying in a single region can lead to latency and availability issues, and it doesn&#8217;t take advantage of the global reach that Akamai Connected Cloud offers.</p>



<p>Also, single region deployments are kinda boring <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f642.png" alt="🙂" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>



<p>In this post, we&#8217;ll expand our deployment by setting up virtual machines in <strong>all available</strong> Linode regions and replacing the current <a href="https://www.linode.com/products/nodebalancers/" title="Nodebalancer ">Nodebalancer</a> with Akamai’s new <a href="https://www.linode.com/green-light/" title="Cloud Load Balancer">Cloud Load Balancer</a> (currently in beta access).</p>



<p><strong>What is Akamai&#8217;s new Cloud Load Balancer, you may ask? It&#8217;s a really cool piece of tech.</strong></p>



<p>Think of it like an umbrella over the internet; it gives you the possibility to load balance your workloads across ANY location; it can be on prem, Akamai&#8217;s cloud, some other hyper-scaler, heck, it can even be your home IP address if you want to <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f642.png" alt="🙂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> </p>



<p>As long as it can be reached over the internet, Cloud Load Balancer can use it to deliver the request. </p>



<p>Joking aside, here&#8217;s a more official description of the service:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>The Akamai Cloud Load Balancer (ACLB) (formerly referred to as Akamai Global Load Balancer) is a layer 4 and 7 load balancer that distributes traffic based on performance, weight, and content (HTTP headers, query strings, etc.). The ACLB is multi-region, independent of Akamai Delivery, and built for East-West and North-South traffic. Key features include multi-region/multi-cloud load balancing and method selection.</p>
</blockquote>



<figure class="wp-block-image size-large"><img fetchpriority="high" decoding="async" width="1024" height="259" src="https://blog.slepcevic.net/wp-content/uploads/2024/06/clb3-1024x259.png" alt="" class="wp-image-447" srcset="https://blog.slepcevic.net/wp-content/uploads/2024/06/clb3-1024x259.png 1024w, https://blog.slepcevic.net/wp-content/uploads/2024/06/clb3-300x76.png 300w, https://blog.slepcevic.net/wp-content/uploads/2024/06/clb3-768x195.png 768w, https://blog.slepcevic.net/wp-content/uploads/2024/06/clb3.png 1303w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<h3 class="wp-block-heading">Why Scale Globally?</h3>



<p>Scaling out our application across multiple regions has several benefits:</p>



<ol class="wp-block-list">
<li><strong>Reduced Latency</strong>: By having servers closer to our users, we can significantly reduce the time it takes for requests to travel back and forth.</li>



<li><strong>High Availability</strong>: Distributing the load across multiple regions ensures that if one region goes down, well, we kinda don&#8217;t care, our app stays online <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f642.png" alt="🙂" class="wp-smiley" style="height: 1em; max-height: 1em;" /></li>



<li><strong>Better Performance</strong>: I mean, you can&#8217;t beat physics; simply having the possibility to do compute closer to the user improves performance and user experience.</li>
</ol>



<h3 class="wp-block-heading">Step-by-Step Deployment Guide</h3>



<p>Ok, let&#8217;s get into the meaty part. The codebase from our previous post hasn&#8217;t changed significantly; the only thing we changed is that we&#8217;re no longer hardcoding our region. Instead, we fetch the list of available regions from the Linode API and deploy an instance in each region.</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>In the code which fetches the regions, you will notice that I commented out the authentication part. </p>



<p>Some regions are available only to authenticated users; if you&#8217;re one of those, just uncomment those few lines and the full region list will be returned to you. </p>
</blockquote>
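<p>To make the region lookup a bit more concrete, here is a minimal Python sketch of the same idea. It is not the Terraform code from the repo, just an illustration: it assumes the public <code>https://api.linode.com/v4/regions</code> endpoint and its <code>data</code>, <code>id</code> and <code>capabilities</code> fields, and the function names and the sample payload are made up for this example.</p>

```python
import json
import urllib.request

REGIONS_URL = "https://api.linode.com/v4/regions"


def fetch_regions(token=None):
    """Download the region list; the token is optional, as in the post."""
    request = urllib.request.Request(REGIONS_URL)
    if token:
        # Authenticated calls may return regions hidden from anonymous users.
        request.add_header("Authorization", "Bearer " + token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def compute_region_ids(payload):
    """Return the IDs of regions that advertise the "Linodes" capability."""
    return sorted(
        region["id"]
        for region in payload.get("data", [])
        if "Linodes" in region.get("capabilities", [])
    )


# Trimmed-down, made-up payload in the shape the API returns.
sample = {
    "data": [
        {"id": "us-east", "capabilities": ["Linodes", "NodeBalancers"]},
        {"id": "eu-central", "capabilities": ["Linodes"]},
        {"id": "example-storage-only", "capabilities": ["Object Storage"]},
    ]
}
print(compute_region_ids(sample))
```

<p>Calling <code>compute_region_ids(fetch_regions())</code> would give you the list of regions to loop over; passing a token to <code>fetch_regions</code> mirrors the commented-out authentication part mentioned above.</p>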



<p>Let&#8217;s start with terraform plan and see what we will create. </p>



<pre class="wp-block-code"><code>terraform plan</code></pre>



<figure class="wp-block-image size-full"><img decoding="async" width="942" height="690" src="https://blog.slepcevic.net/wp-content/uploads/2024/06/1.png" alt="" class="wp-image-449" srcset="https://blog.slepcevic.net/wp-content/uploads/2024/06/1.png 942w, https://blog.slepcevic.net/wp-content/uploads/2024/06/1-300x220.png 300w, https://blog.slepcevic.net/wp-content/uploads/2024/06/1-768x563.png 768w" sizes="(max-width: 942px) 100vw, 942px" /></figure>



<p>Ok, 25 instances, just what we expect since Akamai has 25 compute regions currently publicly available. </p>



<p>Let&#8217;s proceed with terraform apply </p>



<pre class="wp-block-code"><code>terraform apply</code></pre>



<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="690" src="https://blog.slepcevic.net/wp-content/uploads/2024/06/2-1024x690.png" alt="" class="wp-image-450" srcset="https://blog.slepcevic.net/wp-content/uploads/2024/06/2-1024x690.png 1024w, https://blog.slepcevic.net/wp-content/uploads/2024/06/2-300x202.png 300w, https://blog.slepcevic.net/wp-content/uploads/2024/06/2-768x518.png 768w, https://blog.slepcevic.net/wp-content/uploads/2024/06/2.png 1300w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>I don&#8217;t know about you, but I always nerd out on seeing a bunch of servers popping up in the console <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f600.png" alt="😀" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>
</blockquote>



<p>In a minute or two we should have all instances deployed. After the instances have been deployed, <strong>cloud-init</strong> will kick off and install DeepSparse server with an Nginx proxy in front (check out the previous post or the YAML file in the repo for more details).</p>



<p><strong>Awesome</strong>! Now that we&#8217;ve got the infrastructure up and running, the last step is to add our nodes to the Cloud Load Balancer pool. At the moment we will need to do some ClickOps <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f641.png" alt="🙁" class="wp-smiley" style="height: 1em; max-height: 1em;" /> because the CLB service is currently in beta and IaC support isn&#8217;t out yet.</p>



<p>The first step is creating a Cloud Load Balancer by clicking the &#8220;<strong>Create Cloud Load Balancer</strong>&#8221; button and giving it a name.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="410" src="https://blog.slepcevic.net/wp-content/uploads/2024/06/4-1024x410.png" alt="" class="wp-image-452" srcset="https://blog.slepcevic.net/wp-content/uploads/2024/06/4-1024x410.png 1024w, https://blog.slepcevic.net/wp-content/uploads/2024/06/4-300x120.png 300w, https://blog.slepcevic.net/wp-content/uploads/2024/06/4-768x307.png 768w, https://blog.slepcevic.net/wp-content/uploads/2024/06/4.png 1322w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>During the beta period, Cloud Load Balancer is deployed only in 5 locations. This number will grow drastically once the service goes GA. </p>
</blockquote>



<p>With Akamai&#8217;s Cloud Load Balancer, everything starts with a &#8220;<strong>Configuration</strong>&#8221;. Let&#8217;s create one by pressing the &#8220;<strong>Add Configuration</strong>&#8221; button.</p>



<p>We will configure our load balancer to &#8220;listen&#8221; on both HTTP and HTTPS. Once we select HTTPS as our protocol, we need to add a certificate.</p>



<p>In order to do that, we need to prepare our <strong>certificate</strong> and <strong>private key</strong> which we will paste into the configuration field.<em> In this case I will use a self-signed certificate. </em></p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1298" height="578" src="https://blog.slepcevic.net/wp-content/uploads/2024/06/6-1024x456.png" alt="" class="wp-image-453" srcset="https://blog.slepcevic.net/wp-content/uploads/2024/06/6-1024x456.png 1024w, https://blog.slepcevic.net/wp-content/uploads/2024/06/6-300x134.png 300w, https://blog.slepcevic.net/wp-content/uploads/2024/06/6-768x342.png 768w, https://blog.slepcevic.net/wp-content/uploads/2024/06/6.png 1298w" sizes="auto, (max-width: 1298px) 100vw, 1298px" /></figure>



<p>At this stage we will only cover the configuration for the HTTPS protocol; HTTP is really easy, so I won&#8217;t waste your time on it.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="485" height="886" src="https://blog.slepcevic.net/wp-content/uploads/2024/06/7.png" alt="" class="wp-image-454" srcset="https://blog.slepcevic.net/wp-content/uploads/2024/06/7.png 485w, https://blog.slepcevic.net/wp-content/uploads/2024/06/7-164x300.png 164w" sizes="auto, (max-width: 485px) 100vw, 485px" /></figure>
</div>


<p>We need to paste in the certificate &amp; the key, enter the SNI hostname and press the &#8220;<strong>Create and Add</strong>&#8221; button.</p>



<p>After we&#8217;ve got the configuration and the certificate added, we need to create a &#8220;<strong>Route</strong>&#8221;. Let&#8217;s click on the &#8220;<strong>Create a New HTTP Route</strong>&#8221; button and give it a name.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="485" height="483" src="https://blog.slepcevic.net/wp-content/uploads/2024/06/9.png" alt="" class="wp-image-455" srcset="https://blog.slepcevic.net/wp-content/uploads/2024/06/9.png 485w, https://blog.slepcevic.net/wp-content/uploads/2024/06/9-300x300.png 300w, https://blog.slepcevic.net/wp-content/uploads/2024/06/9-150x150.png 150w" sizes="auto, (max-width: 485px) 100vw, 485px" /></figure>
</div>


<p>Great, we&#8217;ve created a route, but the route is currently empty and it doesn&#8217;t route anything. We will come back to this a bit later. </p>



<p>The next step is to save our configuration and click on the &#8220;<strong>Service Targets</strong>&#8221; tab.</p>



<p>This is the place where we will define our target groups and origin servers. Click on the &#8220;<strong>New Service Target</strong>&#8221; button.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="349" src="https://blog.slepcevic.net/wp-content/uploads/2024/06/10-1024x349.png" alt="" class="wp-image-456" srcset="https://blog.slepcevic.net/wp-content/uploads/2024/06/10-1024x349.png 1024w, https://blog.slepcevic.net/wp-content/uploads/2024/06/10-300x102.png 300w, https://blog.slepcevic.net/wp-content/uploads/2024/06/10-768x262.png 768w, https://blog.slepcevic.net/wp-content/uploads/2024/06/10.png 1301w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>The next steps are quite self-explanatory: we need to give it a name and add the nodes which we want to load balance across.</p>



<p><strong>Remember, this can be one of the existing Linode instances or it can be ANY IP address which can be reached via the internet.</strong></p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="440" src="https://blog.slepcevic.net/wp-content/uploads/2024/06/11-1024x440.png" alt="" class="wp-image-457" srcset="https://blog.slepcevic.net/wp-content/uploads/2024/06/11-1024x440.png 1024w, https://blog.slepcevic.net/wp-content/uploads/2024/06/11-300x129.png 300w, https://blog.slepcevic.net/wp-content/uploads/2024/06/11-768x330.png 768w, https://blog.slepcevic.net/wp-content/uploads/2024/06/11-1536x660.png 1536w, https://blog.slepcevic.net/wp-content/uploads/2024/06/11.png 1801w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>This is the step where I could really use IaC support; we need to add all 25 servers to our &#8220;<strong>Endpoints</strong>&#8221; list using ClickOps. <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f642.png" alt="🙂" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>



<p>This is also the place where you can select the <strong>load balancing algorithm</strong> which will be used to balance requests between the nodes. At the moment there are 5 of them available:</p>



<ul class="wp-block-list">
<li><strong>Round Robin</strong></li>



<li><strong>Least Request</strong></li>



<li><strong>Ring Hash</strong></li>



<li><strong>Random</strong></li>



<li><strong>Maglev</strong></li>
</ul>
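<p>If the names in that list are new to you, here is a toy Python sketch of the two simplest ones, Round Robin and Least Request. It only illustrates the general idea and says nothing about how Cloud Load Balancer implements them internally; the endpoint labels and request counts are invented for the example.</p>

```python
import itertools


def round_robin(endpoints):
    """Yield endpoints in a fixed rotation, one per request."""
    return itertools.cycle(endpoints)


def least_request(active_requests):
    """Pick the endpoint with the fewest in-flight requests."""
    return min(active_requests, key=active_requests.get)


# Hypothetical endpoint labels; the real list would hold your 25 instances.
endpoints = ["node-us-east", "node-eu-central", "node-ap-south"]

rotation = round_robin(endpoints)
first_four = [next(rotation) for _ in range(4)]
print(first_four)

in_flight = {"node-us-east": 7, "node-eu-central": 2, "node-ap-south": 5}
print(least_request(in_flight))
```

<p>Round Robin ignores how busy a node is, while Least Request favours the node with the least work in progress, which tends to suit inference workloads where some requests take longer than others.</p>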



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="547" src="https://blog.slepcevic.net/wp-content/uploads/2024/06/15-1024x547.png" alt="" class="wp-image-460" srcset="https://blog.slepcevic.net/wp-content/uploads/2024/06/15-1024x547.png 1024w, https://blog.slepcevic.net/wp-content/uploads/2024/06/15-300x160.png 300w, https://blog.slepcevic.net/wp-content/uploads/2024/06/15-768x410.png 768w, https://blog.slepcevic.net/wp-content/uploads/2024/06/15-1536x821.png 1536w, https://blog.slepcevic.net/wp-content/uploads/2024/06/15.png 1783w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>The last step in the &#8220;<strong>Service Target</strong>&#8221; configuration is to set the path and host header we will use for the health checks on the nodes and click on the &#8220;<strong>Save Service Target</strong>&#8221; button.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="508" src="https://blog.slepcevic.net/wp-content/uploads/2024/06/12-1024x508.png" alt="" class="wp-image-458" srcset="https://blog.slepcevic.net/wp-content/uploads/2024/06/12-1024x508.png 1024w, https://blog.slepcevic.net/wp-content/uploads/2024/06/12-300x149.png 300w, https://blog.slepcevic.net/wp-content/uploads/2024/06/12-768x381.png 768w, https://blog.slepcevic.net/wp-content/uploads/2024/06/12-1536x762.png 1536w, https://blog.slepcevic.net/wp-content/uploads/2024/06/12.png 1807w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p><strong>We&#8217;re almost there, I promise <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f600.png" alt="😀" class="wp-smiley" style="height: 1em; max-height: 1em;" /> </strong></p>



<p>The final step is to go back to the &#8220;<strong>Routes</strong>&#8221; tab, click on the route which we&#8217;ve created earlier and click on the Edit button.</p>



<p>In the rule configuration we will enter the hostname which we want to match on and select our &#8220;<strong>Service Target</strong>&#8221; from the dropdown.</p>



<p>We can also do advanced request matching based on <strong>path, header, method, regex or query string</strong> but for now we will use path prefix. </p>
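<p>Conceptually, a rule like this boils down to &#8220;if the hostname matches and the path starts with the prefix, send the request to this service target&#8221;. Here is a small Python sketch of that logic, purely as an illustration; the rule structure and the service target name <code>deepsparse-nodes</code> are hypothetical and not the load balancer&#8217;s actual configuration format.</p>

```python
def match_route(rules, host, path):
    """Return the service target of the first rule that matches, else None."""
    for rule in rules:
        if rule["hostname"] == host and path.startswith(rule["path_prefix"]):
            return rule["service_target"]
    return None


# Hypothetical rule; "deepsparse-nodes" is a made-up service target name.
rules = [
    {
        "hostname": "clbtest.slepcevic.net",
        "path_prefix": "/",
        "service_target": "deepsparse-nodes",
    }
]

print(match_route(rules, "clbtest.slepcevic.net", "/docs"))
print(match_route(rules, "other.example.com", "/docs"))
```

<p>With a path prefix of <code>/</code>, every path on the matching hostname ends up on the same service target, which is all we need for this deployment.</p>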



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="535" src="https://blog.slepcevic.net/wp-content/uploads/2024/06/13-1-1024x535.png" alt="" class="wp-image-461" srcset="https://blog.slepcevic.net/wp-content/uploads/2024/06/13-1-1024x535.png 1024w, https://blog.slepcevic.net/wp-content/uploads/2024/06/13-1-300x157.png 300w, https://blog.slepcevic.net/wp-content/uploads/2024/06/13-1-768x401.png 768w, https://blog.slepcevic.net/wp-content/uploads/2024/06/13-1-1536x803.png 1536w, https://blog.slepcevic.net/wp-content/uploads/2024/06/13-1.png 1814w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>Ok, we have configured our Cloud Load Balancer; the final step to get our application running is to create a CNAME record for my subdomain &#8220;<strong>clbtest.slepcevic.net</strong>&#8221; and point it to &#8220;<strong>MDEE053110.mesh.akadns.net</strong>&#8221; (<em>visible in the Summary page of the load balancer</em>).</p>



<p>Let&#8217;s go ahead and visit our website. We see that our DeepSparse API server is happily responding and ready to receive requests! Woohoo!</p>



<p><strong>Yes, it&#8217;s that easy. </strong>In less than 10 minutes we have deployed a globally distributed application on Akamai Connected Cloud. Once IaC support for Cloud Load Balancer is rolled out, we can bring this time down to 5 minutes without any problems. </p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="649" src="https://blog.slepcevic.net/wp-content/uploads/2024/06/16-1024x649.png" alt="" class="wp-image-462" srcset="https://blog.slepcevic.net/wp-content/uploads/2024/06/16-1024x649.png 1024w, https://blog.slepcevic.net/wp-content/uploads/2024/06/16-300x190.png 300w, https://blog.slepcevic.net/wp-content/uploads/2024/06/16-768x487.png 768w, https://blog.slepcevic.net/wp-content/uploads/2024/06/16.png 1264w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<h2 class="wp-block-heading">Ok, 25 regions is cool, but that isn&#8217;t truly global is it? <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f642.png" alt="🙂" class="wp-smiley" style="height: 1em; max-height: 1em;" /></h2>



<p>Yes, you&#8217;re right; with 25 regions we have covered the large majority of the global (internet) population. Can we do better? For sure! Welcome Gecko!</p>



<h3 class="wp-block-heading">Gecko?</h3>



<p>Akamai’s new initiative, code-named Gecko, is set to revolutionize cloud computing by integrating cloud capabilities directly into Akamai&#8217;s extensive edge network. This move aligns perfectly with Akamai’s strategy to provide high-performance, low-latency, and globally scalable solutions. By embedding compute capabilities at or VERY near the edge, Gecko aims to deliver workloads closer to users, devices, and data sources than ever before.</p>



<h3 class="wp-block-heading">What Does Gecko Mean for Our Deployment in the future?</h3>



<p>Gecko will enable us to deploy our sentiment analysis application in hundreds of new locations worldwide, <strong>including traditionally hard-to-reach areas</strong>. This means much lower latency, improved performance, and enhanced availability for users across the world.</p>



<h3 class="wp-block-heading">The Benefits of Deploying on Gecko</h3>



<ol class="wp-block-list">
<li><strong>Ultra-Low Latency</strong>: By running our workloads even closer to end-users, we can drastically reduce the time it takes to process and respond to requests.</li>



<li><strong>Global Reach</strong>: With Gecko, we can deploy in cities and regions where traditional cloud providers struggle to reach, ensuring a truly global presence. </li>



<li><strong>Scalability and Flexibility</strong>: With Akamai&#8217;s large compute footprint, we can scale out our application to tens of thousands of nodes across hundreds of locations. </li>



<li><strong>Consistent Experience</strong>: Let&#8217;s be real, if you&#8217;re running a global application, you&#8217;re most probably dealing with multiple providers; with Gecko you can consolidate all of your workloads and location coverage with a single provider. Just the operational benefits of that should be enough to &#8220;tickle&#8221; your brain into considering it for your application.</li>
</ol>



<h2 class="wp-block-heading">Want to try it yourself?</h2>



<p>Have fun <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f642.png" alt="🙂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Go ahead and clone the repo from <a href="https://github.com/slepix/neuralmagic-globalLinode" target="_blank" rel="noopener" title="">https://github.com/slepix/neuralmagic-globalLinode</a> and sign up to receive beta access to the new Cloud Load Balancer on <a href="https://www.linode.com/green-light/" target="_blank" rel="noopener" title="">https://www.linode.com/green-light/</a>.</p>



<p>NeuralMagic running across all Linode (including Gecko) regions in the next post? Perhaps <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f642.png" alt="🙂" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>



<p>Cheers! Alex</p><p>The post <a href="https://blog.slepcevic.net/deploying-a-global-sentiment-analysis-service-using-deepsparse-and-akamai-connected-cloud/">Deploying a GLOBAL sentiment Analysis service using DeepSparse and Akamai Connected Cloud</a> first appeared on <a href="https://blog.slepcevic.net">Architect the cloud</a>.</p>]]></content:encoded>
					
					<wfw:commentRss>https://blog.slepcevic.net/deploying-a-global-sentiment-analysis-service-using-deepsparse-and-akamai-connected-cloud/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Sentiment analysis of 40 thousand movie reviews in 20 minutes using Neural Magic&#8217;s DeepSparse inference runtime and Linode virtual machines.</title>
		<link>https://blog.slepcevic.net/sentiment-analysis-of-40-thousand-movie-reviews-in-20-minutes-using-neural-magics-deepsparse-inference-runtime-and-linode/</link>
					<comments>https://blog.slepcevic.net/sentiment-analysis-of-40-thousand-movie-reviews-in-20-minutes-using-neural-magics-deepsparse-inference-runtime-and-linode/#respond</comments>
		
		<dc:creator><![CDATA[Alesandro Slepčević]]></dc:creator>
		<pubDate>Sat, 30 Mar 2024 22:54:00 +0000</pubDate>
				<category><![CDATA[Akamai Connected Cloud]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Cloud]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Terraform]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[DeepSparse]]></category>
		<category><![CDATA[Linode]]></category>
		<category><![CDATA[Neural Magic]]></category>
		<guid isPermaLink="false">https://blog.slepcevic.net/?p=287</guid>

					<description><![CDATA[<p>First, let me start with a word or two about DeepSparse. DeepSparse is a sparsity-aware inference runtime that delivers GPU-class performance on commodity CPUs, purely in software, anywhere. GPUs Are Not Optimal &#8211; Machine learning inference has evolved over the...</p>
<p>The post <a href="https://blog.slepcevic.net/sentiment-analysis-of-40-thousand-movie-reviews-in-20-minutes-using-neural-magics-deepsparse-inference-runtime-and-linode/">Sentiment analysis of 40 thousand movie reviews in 20 minutes using Neural Magic’s DeepSparse inference runtime and Linode virtual machines.</a> first appeared on <a href="https://blog.slepcevic.net">Architect the cloud</a>.</p>]]></description>
										<content:encoded><![CDATA[<p>First, let me start with a word or two about DeepSparse. </p>



<p><strong>DeepSparse is a sparsity-aware inference runtime that delivers GPU-class performance on commodity CPUs, purely in software, anywhere.</strong></p>



<p><strong>GPUs Are Not Optimal</strong> &#8211; Machine learning inference has evolved over the years led by GPU advancements. GPUs are fast and powerful, but they can be expensive, have shorter life spans, and require a lot of electricity and cooling.</p>



<p>Another major problem with GPUs, especially if you&#8217;re thinking in the context of <a href="https://www.akamai.com/newsroom/press-release/akamai-takes-cloud-computing-to-the-edge" title="">Edge computing</a>, is that they can&#8217;t be packed as densely and are less power-efficient than CPUs; not to mention availability these days.</p>



<p>Since Akamai recently partnered up with Neural Magic, I&#8217;ve decided to write a quick tutorial on how to easily get started with running a simple <strong>DeepSparse sentiment analysis workload</strong>. </p>



<p>In case you want to learn more about Akamai and Neural Magic&#8217;s partnership, make sure to watch this excellent video from TFiR. It will also give you a great summary of Akamai&#8217;s Project Gecko.</p>



<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper">
<div class="responsive-embed widescreen"><iframe loading="lazy" title="Akamai partners with Neural Magic to bring AI to edge use cases" width="1000" height="563" src="https://www.youtube.com/embed/MG45UM4SlbQ?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe></div>
</div></figure>






<h3 class="wp-block-heading">What is Sentiment analysis?</h3>



<p><strong>Sentiment analysis</strong>&nbsp;(also known as&nbsp;<strong>opinion mining</strong>&nbsp;or&nbsp;<strong>emotion AI</strong>) is the use of&nbsp;<a href="https://en.wikipedia.org/wiki/Natural_language_processing">natural language processing</a>,&nbsp;<a href="https://en.wikipedia.org/wiki/Text_analytics">text analysis</a>,&nbsp;<a href="https://en.wikipedia.org/wiki/Computational_linguistics">computational linguistics</a>, and&nbsp;<a href="https://en.wikipedia.org/wiki/Biometrics">biometrics</a>&nbsp;to systematically identify, extract, quantify, and study affective states and subjective information. Sentiment analysis is widely applied to&nbsp;<a href="https://en.wikipedia.org/wiki/Voice_of_the_customer">voice of the customer</a>&nbsp;materials such as reviews and survey responses, online and social media, and healthcare materials for applications that range from&nbsp;<a href="https://en.wikipedia.org/wiki/Marketing">marketing</a>&nbsp;to&nbsp;<a href="https://en.wikipedia.org/wiki/Customer_relationship_management">customer service</a>&nbsp;to clinical medicine.&nbsp;</p>



<p>Why is DeepSparse cool? Because I&#8217;m doing analysis of 40 thousand movie reviews in 20 minutes using only <strong>TWO DUAL CORE Linode VMs. Mind officially blown. </strong></p>






<p>Let&#8217;s do some math here: 40 thousand reviews in 20 minutes works out to 120 thousand processed reviews an hour. With 2 instances and a load balancer, we can process over <strong>86 million requests a month</strong>, which will cost you a <strong>staggering $82 <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f600.png" alt="😀" class="wp-smiley" style="height: 1em; max-height: 1em;" /></strong>.</p>
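<p>If you want to check the numbers yourself, here is the same back-of-the-envelope calculation as a few lines of Python. The 30-day month is my assumption, and the cost per million reviews is simply derived from the figures in this post.</p>

```python
REVIEWS = 40_000          # reviews processed in the test run
MINUTES = 20              # duration of the test run
MONTHLY_COST_USD = 82     # figure quoted in the post

reviews_per_hour = REVIEWS * 60 // MINUTES
reviews_per_month = reviews_per_hour * 24 * 30   # assumes a 30-day month
cost_per_million = MONTHLY_COST_USD / (reviews_per_month / 1_000_000)

print(reviews_per_hour)
print(reviews_per_month)
print(round(cost_per_million, 2))
```

<p>This assumes the instances are kept busy around the clock, so treat it as an upper bound on throughput rather than a promise.</p>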



<p><strong>If you&#8217;re doing that on other cloud providers, you&#8217;re paying a five digit monthly bill for that pleasure. </strong></p>






<h2 class="wp-block-heading">Want to try it yourself? It&#8217;s easy!</h2>



<p>If you want to try it out on Linode, follow the instructions below.</p>



<p>If you want to check out the Neural Magic DeepSparse repo, head over <a href="https://github.com/neuralmagic/deepsparse" target="_blank" rel="noopener" title="">here</a>.</p>



<p><strong>Step 1.  Clone the Repository</strong>.</p>



<p>Open your terminal or command prompt and run the following command: </p>



<pre class="wp-block-code"><code>git clone https://github.com/slepix/neuralmagic-linode</code></pre>



<p>This code will deploy <strong>2 x Dedicated 4 GB</strong> virtual machines and a <strong>Nodebalancer</strong>. It will also install Neural Magic&#8217;s DeepSparse runtime as a Linux service and install &amp; configure Nginx to proxy requests to the DeepSparse server listening on 127.0.0.1:5543.</p>



<p class="has-vivid-red-color has-text-color has-link-color wp-elements-f0a0be9819cd2bc070b1912e9e812e46"><strong>WARNING: THIS IS NOT PRODUCTION GRADE SERVER CONFIGURATION!</strong></p>



<p class="has-vivid-red-color has-text-color has-link-color wp-elements-42cceec5227b00bdbbe795cf30c2587a">It&#8217;s just a POC! Secure your servers and consult Neural Magic documentation if you want to go to production. </p>



<p><strong>Step 2. </strong>&#8211; <strong>Navigate to the repo and install Terraform</strong></p>



<p>Navigate to the repo using the following command: </p>



<pre class="wp-block-code"><code>cd neuralmagic-linode</code></pre>



<p>If you haven&#8217;t already installed Terraform on your machine, you can download it from the <a href="https://developer.hashicorp.com/terraform/install?product_intent=terraform" target="_blank" rel="noopener" title="">official Terraform website</a> and follow the installation instructions for your operating system.</p>



<p><strong>Step 3. </strong>&#8211; <strong>Terraform init</strong></p>



<p>Initialize Terraform by running:</p>



<pre class="wp-block-code"><code>terraform init</code></pre>



<p><strong>Step 4. </strong>&#8211; <strong>Configure your Linode token</strong></p>



<p>Open the <strong>variables.tf</strong> file and paste in your Linode token. If you don&#8217;t know how to create a Linode PAT (personal access token), check this article <strong><a href="https://www.linode.com/docs/products/tools/api/guides/manage-api-tokens/" target="_blank" rel="noopener" title="">here</a></strong>. It should look similar to the picture below. You can also adjust the region while you&#8217;re here <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f600.png" alt="😀" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>



<pre class="wp-block-preformatted">The token in the picture is not valid. It's just an example. </pre>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="772" height="275" src="https://blog.slepcevic.net/wp-content/uploads/2024/03/tokendemo.png" alt="" class="wp-image-293" srcset="https://blog.slepcevic.net/wp-content/uploads/2024/03/tokendemo.png 772w, https://blog.slepcevic.net/wp-content/uploads/2024/03/tokendemo-300x107.png 300w, https://blog.slepcevic.net/wp-content/uploads/2024/03/tokendemo-768x274.png 768w" sizes="auto, (max-width: 772px) 100vw, 772px" /></figure>



<p><strong>Step 5. </strong>&#8211; <strong>Run Terraform apply</strong></p>



<p>After configuring your variables, you can apply the Terraform configuration by running:</p>



<pre class="wp-block-code"><code>terraform apply</code></pre>



<p>Terraform will show you a plan of the changes it intends to make. </p>



<p>Review the plan carefully, and if everything looks good, type <code><strong>yes</strong></code> and press Enter to apply the changes. Give it 5-6 minutes to finish. After that, visit your Nodebalancer IP in a browser and you should be presented with the landing page of the DeepSparse server API. </p>
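


<p>If you would rather not keep refreshing the browser, a small polling loop can tell you when the landing page starts answering. This is only a sketch, not part of the repository: the function names, the retry counts and the placeholder IP are my own assumptions, and it treats a plain HTTP 200 on the root path as "ready".</p>

```python
import time
import urllib.error
import urllib.request


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on connection or HTTP errors."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(url, probe=http_probe, attempts=40, delay=15.0, sleep=time.sleep):
    """Poll url until probe succeeds; return the attempt number, or 0 if it never did."""
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        sleep(delay)  # wait before the next try
    return 0
```

<p>Calling <code>wait_until_ready("http://203.0.113.10/")</code> (swap in your own Nodebalancer IP, the one here is a documentation placeholder) checks every 15 seconds for up to 10 minutes, which should comfortably cover the 5-6 minutes the deployment needs.</p>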



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="518" src="https://blog.slepcevic.net/wp-content/uploads/2024/03/Screenshot-2024-03-31-233941-1024x518.png" alt="" class="wp-image-294" srcset="https://blog.slepcevic.net/wp-content/uploads/2024/03/Screenshot-2024-03-31-233941-1024x518.png 1024w, https://blog.slepcevic.net/wp-content/uploads/2024/03/Screenshot-2024-03-31-233941-300x152.png 300w, https://blog.slepcevic.net/wp-content/uploads/2024/03/Screenshot-2024-03-31-233941-768x388.png 768w, https://blog.slepcevic.net/wp-content/uploads/2024/03/Screenshot-2024-03-31-233941.png 1457w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p><strong>Step 6. </strong>&#8211; <strong>Send a test request</strong></p>



<p>After the installation is done, it&#8217;s finally time to send some data to our API and see how it performs. </p>



<p>We can do that with <strong>curl</strong>, or with <strong>Invoke-WebRequest</strong> if you&#8217;re on Windows and using PowerShell. </p>



<p><strong>CURL: </strong></p>



<pre class="wp-block-code"><code>sentence="Neural Magic &amp; Akamai are cool!"
nodebalancer="172.233.34.110" #PUT YOUR NODEBALANCER IP HERE
curl -X POST http://$nodebalancer/v2/models/sentiment_analysis/infer -H "Content-Type: application/json" -d "{\"sequences\": \"$sentence\"}"</code></pre>



<p><strong>PowerShell:</strong></p>






<pre class="wp-block-code"><code>$sentence = "Neural Magic &amp; Akamai are cool!"
$nodebalancer = "172.233.34.110" # PUT YOUR NODEBALANCER IP HERE

$path = "v2/models/sentiment_analysis/infer"
$api = "http://$nodebalancer/$path"
$body = @{
   sequences = $sentence
} | ConvertTo-Json

(Invoke-WebRequest -Uri $api -Method Post -ContentType "application/json" -Body $body -ErrorAction Stop).content</code></pre>



<p>In both cases, make sure to paste in the <strong>IP address of the Nodebalancer</strong> you deployed, and modify the sentence as you wish. </p>
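


<p>If Python is more your thing, the same request can be sketched with nothing but the standard library. The endpoint path and the <code>sequences</code> field are taken from the examples above; the function names are my own, and since the response format isn&#8217;t shown in this post, the sketch simply returns whatever JSON the server sends back. Building the body with <code>json.dumps</code> also means sentences containing quotes get escaped properly.</p>

```python
import json
import urllib.request


def build_request(nodebalancer: str, sentence: str) -> urllib.request.Request:
    """Build the POST request for the sentiment analysis endpoint."""
    url = f"http://{nodebalancer}/v2/models/sentiment_analysis/infer"
    # json.dumps takes care of escaping quotes and backslashes in the sentence.
    body = json.dumps({"sequences": sentence}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(nodebalancer: str, sentence: str, timeout: float = 30.0):
    """Send one sentence to the API and return the decoded JSON response."""
    request = build_request(nodebalancer, sentence)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

<p>Usage would be something like <code>print(analyze("172.233.34.110", "Neural Magic &amp; Akamai are cool!"))</code>, again with your own Nodebalancer IP.</p>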



<h2 class="wp-block-heading">Benchmark time!</h2>



<p>In the repository, I&#8217;ve included the dataset (movies.csv, shipped as movies.zip) and three benchmark scripts: two in PowerShell and one in Python.</p>



<p><strong>movies.zip</strong> &#8211; unzip this one in the same folder where your benchmark scripts are. </p>



<p><strong>analyze.ps1</strong> &#8211; PowerShell-based benchmark, sends requests serially &#8211; the slowest option. </p>



<p><strong>panalyze.ps1</strong> &#8211; PowerShell-based benchmark, sends requests in parallel &#8211; performs better.</p>



<p><strong>pypanalyze.py</strong> &#8211; Python-based benchmark, sends requests in parallel &#8211; <strong>best performer (doh!) &lt;- use this</strong></p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="581" height="153" src="http://blog.slepcevic.net/wp-content/uploads/2024/04/Screenshot-2024-04-01-014021.png" alt="" class="wp-image-327" srcset="https://blog.slepcevic.net/wp-content/uploads/2024/04/Screenshot-2024-04-01-014021.png 581w, https://blog.slepcevic.net/wp-content/uploads/2024/04/Screenshot-2024-04-01-014021-300x79.png 300w" sizes="auto, (max-width: 581px) 100vw, 581px" /></figure>



<p>All you need to do to kick off a benchmark is update the URL variable with your Nodebalancer IP, and you&#8217;re off to the races. </p>
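


<p>To give an idea of what "sends requests in parallel" means in practice, here is a minimal sketch of the pattern using a thread pool. It is not the pypanalyze.py from the repository; the function name, the default worker count and the <code>send</code> callback are assumptions made for illustration.</p>

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_benchmark(reviews, send, workers: int = 8):
    """Call send(review) for every review on a thread pool; return (results, seconds)."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input reviews.
        results = list(pool.map(send, reviews))
    elapsed = time.perf_counter() - started
    return results, elapsed
```

<p>Here <code>send</code> is any function that posts a single review to your Nodebalancer and returns the answer. Threads fit this job because each worker spends most of its time waiting on the network, so raising <code>workers</code> keeps more requests in flight until the servers themselves become the bottleneck.</p>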



<h2 class="wp-block-heading">Does it scale?</h2>



<p><strong>Yes!</strong> For kicks, I added a third node and the same job finished in 825 seconds. Feel free to add as many nodes as you like and see what numbers you can get. Additionally, you can play with the number of workers in the Python file. </p>
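


<p>For a rough feel of how well that scales, here is the back-of-the-envelope math. The 825 seconds is the three-node result above; for the two-node run I&#8217;m assuming the rounded "40 thousand reviews in 20 minutes" from the title of this post, so treat the output as an estimate rather than a measurement.</p>

```python
REVIEWS = 40_000            # "40 thousand movie reviews", from the post title
TWO_NODE_SECONDS = 20 * 60  # "20 minutes", rounded figure from the post title
THREE_NODE_SECONDS = 825    # measured after adding the third node


def throughput(reviews: int, seconds: float) -> float:
    """Reviews processed per second."""
    return reviews / seconds


speedup = TWO_NODE_SECONDS / THREE_NODE_SECONDS  # observed gain from the extra node
ideal = 3 / 2                                     # perfect linear scaling from 2 to 3 nodes
efficiency = speedup / ideal

print(f"2 nodes: {throughput(REVIEWS, TWO_NODE_SECONDS):.1f} reviews/s")
print(f"3 nodes: {throughput(REVIEWS, THREE_NODE_SECONDS):.1f} reviews/s")
print(f"speedup {speedup:.2f}x, scaling efficiency {efficiency:.0%}")
```

<p>With those inputs it works out to roughly 33 versus 48 reviews per second, a 1.45x speedup for 1.5x the nodes, or about 97% of perfectly linear scaling.</p>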



<pre class="wp-block-preformatted">Note 1: The Python script was written with the help of ChatGPT :) Its results matched my PowerShell version on a smaller, verified sample (see Note 2), so I'm gonna call it good :)

Note 2: The PowerShell versions don't handle some comments as they should and end up sending garbage to the API. This happens in 3% of the cases. It's most probably an encoding/character issue which I couldn't be bothered to fix :)

Note 3: The movies.csv file was generated using data from https://kaggle.com/</pre>



<p>Cheers, </p>



<p>Alex. </p>



<p></p><p>The post <a href="https://blog.slepcevic.net/sentiment-analysis-of-40-thousand-movie-reviews-in-20-minutes-using-neural-magics-deepsparse-inference-runtime-and-linode/">Sentiment analysis of 40 thousand movie reviews in 20 minutes using Neural Magic’s DeepSparse inference runtime and Linode virtual machines.</a> first appeared on <a href="https://blog.slepcevic.net">Architect the cloud</a>.</p>]]></content:encoded>
					
					<wfw:commentRss>https://blog.slepcevic.net/sentiment-analysis-of-40-thousand-movie-reviews-in-20-minutes-using-neural-magics-deepsparse-inference-runtime-and-linode/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
