Our sentiment analysis relies on several models that extract features from text in order to determine sentiment.
Ever wanted to extract images from web pages? Now you can with one simple API call. Repustate’s clean-html API call has been one of our most popular since Day 1. It hasn’t been touched much, as its performance was quite good from the get-go, but that changed recently. Now you can extract images as well as the text from any web page.
With each passing month, Repustate has grown in popularity. More and more businesses rely on Repustate to analyze the data that flows through their systems. A core tenet of Repustate has been that we never wanted to throttle our API if we didn’t have to; we wanted our users, particularly our paying customers, to be able to push the limits with our sentiment analysis API.
This is a short blog post, really intended for fellow Go developers who stumble upon the same error: the dreaded “duplicate symbols” error. Currently, some of Repustate’s Go code uses cgo to talk to various C libraries. It’s a stopgap until we finish porting all of the C code to pure Go.
No matter how clever your algorithms and heuristics, having a ton of data trumps all. We’ve been mixing ingredients for a while at Repustate, trying to come up with the perfect recipe for our problem. Our goal is to determine what somebody intends to buy based on what they write on various social media outlets.
A core function that any text analytics package needs is language detection. By language detection, we refer to the following problem: “Given an arbitrary piece of text of arbitrary length, determine in which language the text was written.” It might sound simple for a human, assuming you know a thing or two about languages, but we’re talking about computers here.
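To make the problem statement concrete, here is a minimal sketch of one naive approach: count how many common words ("stopwords") from each language show up in the text and pick the language with the most hits. This is an illustration only, not a description of how Repustate does it. The `STOPWORDS` table and the `detect_language` function are hypothetical names, and the word lists are far too small for real use.

```python
import re

# Hypothetical, deliberately tiny stopword lists (an assumption for this
# sketch); a real detector would use much larger vocabularies or
# character n-gram statistics.
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in", "that", "it", "was", "for"},
    "fr": {"le", "la", "les", "et", "est", "des", "un", "une", "que", "dans"},
    "es": {"el", "la", "los", "y", "es", "de", "un", "una", "que", "en"},
    "de": {"der", "die", "das", "und", "ist", "nicht", "ein", "eine", "zu", "mit"},
}


def detect_language(text):
    """Return the code of the language whose stopwords occur most often.

    Returns None when no known stopword appears in the text.
    """
    # Split into lowercase runs of letters, ignoring digits and punctuation.
    words = re.findall(r"[^\W\d_]+", text.lower())
    # Score each language by the number of words that are in its stopword set.
    scores = {
        lang: sum(1 for word in words if word in stops)
        for lang, stops in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(detect_language("Le chat est dans le jardin et la maison"))
```

Even this toy version shows why the problem is harder for a computer than for a person: short texts may contain no stopwords at all, and closely related languages share many of them, so the counts alone can be ambiguous.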
Often you have to interact with programs that require passwords or some other input from the user. For security purposes, some programs will not read from stdin, so you have to be creative. Enter "expect". Expect is a tool built on Tcl that allows you to mimic a conversation you’d have with any number of programs.