
Shall we dance?

Our team member, Luciano Straga, shared his passion for dancing with us. He told us how he manages to combine being a developer with being a dancer.

Tell us, when did you start dancing?

Four years ago, in a gym. Nothing professional, just classes aimed at helping people lose weight in a fun way, like Zumba today. Definitely nothing to do with real dancing. After that, I decided to enroll in a dance academy.

What attracted you most to dancing?

It's totally different from the technical stuff I usually do; I consider myself a different kind of developer. I love technical things, but I need something that trains my brain in another way. Dancing has become a challenge for the mind, so you put more effort into learning than into moving.

What rhythm do you most like to dance to?

I like urban styles, so Hip Hop or Jazz Funk are the dance classes I take.

Have you taken dance lessons?

Yes, of course, I take no fewer than five per week. I try to dance every day. That’s the best part of working remotely on a flexible schedule. I love taking classes in Los Angeles, the best place in the world with the best teachers.

When you took classes in Los Angeles, what were your expectations before taking them?

I was afraid of not being prepared. The classes are designed for professionals, so it’s always a challenge. L.A. is the hardest place to train. Luckily, I managed, but I had to bring my best from minute zero.

What did you learn the most from your L.A. classes?

Taking those classes is the best thing you can do if you are serious about dance. There you learn how to take a class. It’s not a game; it’s a professional thing. Teachers go incredibly fast, and you have to be extremely concentrated. The choreographies are longer than back home, and the dancers are beasts.

Have you thought about becoming a professional dancer?

Never. I don’t like dance as a profession. I love what I do, and I’m better at software development. However, I train almost like a professional. It is a challenge for me, and I want to dance like a pro.

Do you have any dance-related projects that you want to share with us?

No, I haven’t participated in any projects. It is just a personal challenge and a great workout.

 

Code documentation good practices

By Francisco Verastegui – Santex’ Technical Board Member

Code documentation is an important part of the development process, and it’s worth the effort in the long term as the application gets bigger and more complex, since it saves time and minimizes the learning curve needed to understand the functionality of APIs, libraries and applications. Here we explain four practices that we hope you embrace as part of your development process.

  1. Document your APIs in a simple and concise way

Libraries and APIs are made to be used by people that might not have time to read the code or just might not have access to it, so documentation should reflect your code objectives in a simple (easy to understand) and concise (focusing on the important facts) way.

  2. Keep your code documentation up-to-date

Update your documentation each time you change your code – especially if business rules are affected. Software evolves over time, and so does the code. It’s therefore important not to start documenting too early in the life of the code, because you might be forced to change the documentation many times.

  3. Focus on the ‘Why’ not the ‘How’

The main idea of this principle is: “Your code documentation should explain the ‘Why’ and your code the ‘How’”.

Good source code can be self-explanatory, but documentation should give it meaning, so we shouldn’t repeat the ‘How.’ The following examples document the same method with different code documentation approaches. The examples are in Java, but these concepts apply to any other programming language as well.

Example 1

In this case, the code documentation (JavaDoc) just explains the ‘How.’ The context isn’t clear, and neither are the business rules that are the reason for the method’s creation. Basically, the documentation provides the same information that we could get by reading the code.
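The original screenshot is not reproduced here; an illustrative sketch of such a ‘How’-style JavaDoc (the method name, parameter and exception wording are hypothetical placeholders) could be:

/**
 * Gets the remove date and the install date from the order, calculates the
 * difference in hours, and throws an exception if it is greater than 24.
 *
 * @param order the order to check
 * @throws OrderEditException if an error occurs
 */
public void validateRecurringOrderPeriod(Order order) throws OrderEditException {
    // implementation omitted
}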

Example 2

In this example, the method’s JavaDoc focuses on the ‘Why,’ explaining the context and the business rules that support it. It is also important to explain the business reason behind an exception that the method might throw.
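The original screenshot of this JavaDoc is also missing; based on the detailed explanation that follows, it would read roughly like this (the method signature is a hypothetical placeholder):

/**
 * When we are editing a recurring series, we have to enforce the rule that
 * recurring {@link Order}s can't exceed a period of more than 24 hours.
 * If the remove TIME portion is less than the install TIME portion, then it
 * is safe to assume that the remove date has rolled onto the next day
 * (e.g. June 1st 7PM -TO- June 2nd 3AM, is still a 24 hour period).
 *
 * @throws OrderEditException if the order was already deleted by a different user
 */
public void validateRecurringOrderPeriod(Order order) throws OrderEditException {
    // implementation omitted
}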

Detailed explanation

“When we are editing a recurring series”: This is the context – whether to include it will depend on whether it is a business-related method or just an ‘isolated’ method like the ones we find in a utility class (reused by different parts of our code).

“we have to enforce the rule that recurring {@link Order}s can’t exceed a period of more than 24 hours”: This is the main part providing the ‘Why’ because it explains a business rule and the main reason for creating the method. A method can explain, or be supported by, more than one business rule.

“If the remove TIME portion is less than the install TIME portion, then it is safe to assume that the remove date has rolled onto the next day (e.g. June 1st 7PM -TO- June 2nd 3AM, is still a 24 hour period)”: Business rule considerations are important for a good understanding of the method behavior. Whether to include them will depend on the complexity and conditions of the rule we are trying to code.

@throws OrderEditException

– if the order was already deleted by a different user: Explanation of the reason (Why) the method is throwing a specific type of exception. It is recommended to do this for any application business exception.

It is important to realize that it is perfectly possible to understand the meaning and the business implications of the method just by reading the code documentation. This is a key concept for APIs that are public and designed to be reused throughout different projects and applications.

  4. Don’t document trivial code

Avoid documenting getter or setter methods (unless they do something business-related), and remove such comments from your IDE’s auto-generated code template. Also avoid documenting simple procedures that are perfectly explained by reading the code. For example:
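The original code image is missing; a typical (and unnecessary) getter comment of this kind looks like this:

/**
 * Gets the name.
 *
 * @return the name
 */
public String getName() {
    return name;
}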

As you can see, doing this only makes the code harder to read.

Deep learning and vision, from simple manipulation to image classification: Part 2

Introduction:

After we revisited some basic concepts related to computer vision in our previous post, it is time to move forward and explore more sophisticated algorithms that will recognize either a dog or a cat in a given image.

Throughout this post, we will work with the Dogs vs Cats problem from Kaggle and its data, which can be found here. You’ll need to register with Kaggle in order to download the train and test data.

After you register and download the data, we’ll perform an exploratory analysis and then build, train and evaluate a convolutional neural network for binomial classification. The model will output 0 or 1 depending on whether it determines that the image contains a cat or a dog, respectively.

[Step 1] Data exploration:

As stated before, data exploration is – most of the time – the first step before we even try to come up with preliminary experiments. By just looking at the files in train.zip and test1.zip we’ve downloaded, we can spot the following details:

Table 1: Initial dataset observations

As our test set is not labeled, it will not be possible for us to use it for getting performance metrics. The files will, therefore, be only used to generate the final submission file for the Kaggle judge.

Another important observation we can make by opening some of the images in the test and train sets is that they seem to be different in size and aspect ratio. In order to confirm this, we’ll randomly plot and compare some of them.

Snippet 1: Randomly plot images from the train set

import os
import glob

import numpy as np
import matplotlib.pyplot as plt

train_path = "data/train"
images = glob.glob(os.path.join(train_path, "*.jpg"))

# Plot six randomly chosen training images with their file names as titles
plt.figure(figsize=(16, 8))
for index in range(6):
    plt.subplot(2, 3, index+1)
    img_index = int(np.random.uniform(0, 24999))
    plt.imshow(plt.imread(images[img_index]))
    plt.title(images[img_index])

Figure 1: Sample images from the training set

As we run the above script several times, we observe that our intuition was right: images differ from each other in size and aspect ratio. Normalization seems to be needed, but several questions immediately arise: What size should we use to resize and normalize all the images so they can later be used to train our model? Wouldn’t the new size need to work for both larger and smaller images? Finally, what proportion of the images are small, medium or large?

To address those questions, we prepare the following script to get the distribution over height and width (in 100-pixel ranges) for each image in the train set:

Snippet 2: Distribution over size in the training set

from PIL import Image  # PIL (Pillow) lets us read image sizes without loading full pixel data

max_w = 0
max_h = 0
min_w = 2048
min_h = 2048

# Histogram buckets: number of images whose width/height falls in each 100-pixel range
arr_h = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
arr_w = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

for img_index in range(len(images)):
    img = Image.open(images[img_index]).size
    img_w = img[0]
    img_h = img[1]

    # Bucket index assumes every image is at least 100 pixels on each side
    arr_w[int(img_w / 100) - 1] += 1
    arr_h[int(img_h / 100) - 1] += 1

    # Track the extreme dimensions seen so far
    if img_w > max_w: max_w = img_w
    elif img_w < min_w: min_w = img_w

    if img_h > max_h: max_h = img_h
    elif img_h < min_h: min_h = img_h

print("Max Width: %i - Min Width: %i \nMax Height: %i - Min Height: %i" % (max_w, min_w, max_h, min_h))

If we plot the arr_w and arr_h vectors containing the number of images with width and height ranging from 0 to 1,000 pixels (in 100-pixel intervals), we observe that the majority of them are smaller than 400 x 400 pixels.
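The plotting code itself is not shown in the original snippets; a minimal sketch, assuming matplotlib and the arr_w and arr_h vectors computed in Snippet 2, could look like this:

# Hypothetical plot of the width and height histograms built in Snippet 2
plt.figure(figsize=(12, 4))
plot_titles = ["Width distribution (100-pixel bins)", "Height distribution (100-pixel bins)"]
for i, plot_data in enumerate([arr_w, arr_h]):
    plt.subplot(1, 2, i + 1)
    plt.bar(range(1, len(plot_data) + 1), plot_data)
    plt.title(plot_titles[i])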

Figure 2: Height and width distributions

We can now come up with a strategy for resizing and padding our images as the only preprocessing we’ll do before training our convolutional neural network. The resizeImg and padImg functions maintain the original aspect ratio of each image and pad it when necessary, so that images with different aspect ratios end up with the same square shape:

Snippet 3: Resizing and padding functions

resize_default = 64

def resizeImg(image):
    # Scale the image so its longest side equals resize_default, preserving aspect ratio
    img_w = image.size[0]
    img_h = image.size[1]

    if img_w >= img_h:
        img = image.resize((resize_default, int(resize_default * img_h / img_w)), Image.ANTIALIAS)
    else:
        img = image.resize((int(resize_default * img_w / img_h), resize_default), Image.ANTIALIAS)

    return img

def padImg(image):
    # Paste the image, centered, onto a square black canvas so every output ends up square
    img_w = image.size[0]
    img_h = image.size[1]

    if img_w > resize_default or img_h > resize_default:
        new_size = (img_w, img_w) if img_w >= img_h else (img_h, img_h)
    else:
        new_size = (resize_default, resize_default)

    img = Image.new("RGB", new_size)
    img.paste(image, (int((new_size[0] - img_w) / 2), int((new_size[1] - img_h) / 2)))

    return img


#testImage = Image.open(images[int(np.random.uniform(0, 24999))])
testImage = Image.open(images[468])
resized = resizeImg(testImage)
padded = padImg(resized)

plt.figure(figsize=(12, 8))
plt.subplot(1, 3, 1)
plt.imshow(testImage)
plt.title("Original")
plt.subplot(1, 3, 2)
plt.imshow(resized)
plt.title("Resized")
plt.subplot(1, 3, 3)
plt.imshow(padded)
plt.title("Padded")

Calling both functions produces the following output:

Figure 3: Padding and resizing of images

All images will be resized to 64×64 pixels and padded vertically or horizontally if necessary. We can batch process all images as a preliminary step or apply the functions right before we provide the samples to the trainer when fitting the model.

[Step 2] Building the convolutional neural network:

Up to this point, we’re familiar with convolutions for image processing. We’ve also explored the data we have available and decided that padding and resizing are needed in order to give our model a normalized input pattern. Each 64×64 pixel image amounts to 4,096 pixels (12,288 input values across the three RGB channels) that we need to fit into a 2-class classifier. It means that for every 64×64 pixel image we feed into the convolutional network, it will try to predict whether the input data belongs to the class cat or dog.

In addition to the two functions we’ve already seen for resizing and padding, we’ll need some other ones before we train the network. The get_label and getXYBatch functions shown in Snippet 4 are explained below:

get_label: as we’ll get an output vector for every input pattern (or image), it will have a 2-element shape. There are only two possible values for the resulting vector: [1, 0] and [0, 1]. The first one counts as “cat” whereas the second one counts as “dog” in terms of the result the network is predicting.

getXYBatch: given that our computer doesn’t have infinite memory, allocating all 25,000 images for training is just not possible. We’ll resize and pad batches of 60 to 500 images and then feed them to the trainer at each training step.

Snippet 4: get_label and getXYBatch functions

# extract labels
# positives = [1, 0], negatives = [0, 1]
def get_label(path):
    if path.split('/')[-1:][0].startswith('cat'): 
        return np.array([1, 0])
    else:
        return np.array([0, 1])

def getXYBatch(X_input, Y_input, batch_size):
    # Seed the arrays with the first sample so np.vstack has something to stack onto;
    # that extra first row is dropped again below
    X_array = np.array(padImg(resizeImg(Image.open(X_input[0])))).reshape([-1]) / 255
    Y_array = Y_input[0]

    # Pick batch_size random samples (without replacement), resize/pad them and scale to [0, 1]
    choice = np.random.choice(range(len(X_input)), batch_size, replace=False)
    for item in choice:
        tmpimg = np.array(padImg(resizeImg(Image.open(X_input[item])))).reshape([-1]) / 255
        X_array = np.vstack((X_array, tmpimg))
        Y_array = np.vstack((Y_array, Y_input[item]))

    X_array = X_array[1:]
    Y_array = Y_array[1:]

    X_array = X_array.reshape([-1, resize_default, resize_default, 3])

    return X_array, Y_array

Now we split the train set into two parts, one for actual training and one for validation. We’ll hold out 20% of the training images to measure how well the model is performing after, let’s say, 100 iterations. The following code will do it for us:

Snippet 5: Splitting the training set

import random
from sklearn.model_selection import train_test_split

train_path = "data/train"
images = glob.glob(os.path.join(train_path, "*.jpg"))
random.shuffle(images)

# Keep the shuffled file paths and build one 2-element label vector per image
data_images = images
data_labels = np.array([get_label(p) for p in images])
data_labels_out = np.argmax(data_labels, 1)

print("Positive samples: %i\nNegative samples: %i \n" % (len(data_labels_out) - np.count_nonzero(data_labels_out),
                                                         np.count_nonzero(data_labels_out)))

# Split the data: 80% for training, 20% held out for validation
X_train, X_test, y_train, y_test = train_test_split(data_images, data_labels, test_size=0.2)
y_train_out = np.argmax(y_train, 1)
y_test_out = np.argmax(y_test, 1)

Finally, before jumping into the model’s code itself (assuming we’re excited about it), we’ll define some convenience functions to simplify the layer construction:

  • dropout: turns off hidden neurons with a given probability (only in the training phase).
  • weight_variable: variables for the neurons’ weights.
  • bias_variable: variables for the neurons’ biases.
  • conv2d: convolution between the input and the weights, with strides of 1 and padding ‘SAME’.
  • max_pool_2x2: max pooling operation; keeps only the maximum elements after each convolutional layer.

Snippet 6: Common TensorFlow helper methods

import tensorflow as tf  # the TensorFlow 1.x API is used throughout this post

def dropout(x, prob, train_phase):
    # Apply dropout only while training; pass the input through unchanged otherwise
    return tf.cond(train_phase,
                   lambda: tf.nn.dropout(x, prob),
                   lambda: x)

def weight_variable(shape):
    return tf.Variable(tf.truncated_normal(shape, stddev=0.1))

def bias_variable(shape):
    return tf.Variable(tf.constant(0.1, shape=shape))

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

Now let’s build the layers of the network. Our model will have an input layer followed by convolution and max-pooling layers. In the last part of the network architecture, we will flatten the feature maps and have a fully connected layer. A representation of the model is shown in Figure 4.

Figure 4: Neural Network Architecture

We define two placeholders, x and y, for the 64×64 pixel images and their labels. As the images use the RGB schema (3 channels), the final shape for the input layer will be 64x64x3.

Snippet 7: Network implementation

sess = tf.InteractiveSession()

# tf Graph Input
x = tf.placeholder(tf.float32, [None,64,64,3]) 
y = tf.placeholder(tf.float32, [None, 2])

# dropout placeholder
keep_prob = tf.placeholder(tf.float32)

# train flag placeholder
train_phase = tf.placeholder(tf.bool) # For Batch Normalization

# Set model weights
W1 = weight_variable([3, 3, 3, 32])
b1 = bias_variable([32])

W2 = weight_variable([3, 3, 32, 64])
b2 = bias_variable([64])

W3 = weight_variable([3, 3, 64, 64])
b3 = bias_variable([64])

W4 = weight_variable([16 * 16 * 64, 512])
b4 = bias_variable([512])

W5 = weight_variable([512, 2])
b5 = bias_variable([2])

# hidden layers
conv1 = tf.nn.relu(conv2d(x, W1) + b1)
maxp1 = max_pool_2x2(conv1)

conv2 = tf.nn.relu(conv2d(maxp1, W2) + b2)
#maxp2 = max_pool_2x2(conv2)

conv3 = tf.nn.relu(conv2d(conv2, W3) + b3)
maxp3 = max_pool_2x2(conv3)

# fully connected
maxp3_flat = tf.reshape(maxp3, [-1, 16 * 16 * 64])

full1 = tf.nn.relu(tf.matmul(maxp3_flat, W4) + b4)
drop1 = tf.nn.dropout(full1, keep_prob)

#output
output = tf.matmul(drop1, W5) + b5
softmax=tf.nn.softmax(output)

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=output, labels=y))

all_variables = tf.trainable_variables() 

As describing each function and method used would be tedious and make this post super long, feel free to browse the official TensorFlow documentation for the ones you are interested in: https://www.tensorflow.org/api_docs/

You may also want to revisit some concepts related to learning and optimization, such as Loss Functions, Stochastic Gradient Descent and Cross Entropy.
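One note before training: Snippet 8 below uses accuracy, train_step, learning_rate and saver, which are not defined in Snippet 7. A minimal sketch of how they might be defined follows; the Adam optimizer, the exponentially decaying learning rate and its specific values are assumptions for illustration, not necessarily the original choices:

# Assumed definitions for the names referenced in Snippet 8; hyperparameter values are illustrative
global_step = tf.Variable(0, trainable=False)
learning_rate = tf.train.exponential_decay(1e-4, global_step, 1000, 0.96, staircase=True)

# Minimize the softmax cross-entropy loss defined in Snippet 7
train_step = tf.train.AdamOptimizer(learning_rate).minimize(loss, global_step=global_step)

# Accuracy: fraction of samples whose predicted class matches the label
correct_prediction = tf.equal(tf.argmax(softmax, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

saver = tf.train.Saver()
sess.run(tf.global_variables_initializer())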

[Step 3] Training time:

Now we just need to define some hyperparameters and let the trainer fit the model to our training data. We’ll display the model accuracy every 100 steps. Running the snippet below will show the training progress as shown in Figure 5.

Snippet 8: Trainer

# Hyper-parameters
training_steps = 2000
batch_size = 500
display_step = 100

# Mini-batch Gradient Descent
training_accuracy = []
training_loss     = []

for i in range(training_steps):
    
    X,Y = getXYBatch(X_train,y_train,batch_size)
    
    batch_accuracy, batch_loss, _ = sess.run([accuracy, loss, train_step],
                                             feed_dict={x:X, y:Y, train_phase: True, keep_prob: 1.0})
    training_accuracy.append(batch_accuracy)
    training_loss.append(batch_loss)
    # Displaying info
    if (i+1)%display_step == 0 or i == 0:
        print("Step %05d: accuracy=%.4f\tloss=%.6f\tlearning rate=%.6f" %
              (i+1, batch_accuracy, batch_loss, learning_rate.eval()))

save_path = saver.save(sess, "./saved/model2K.ckpt")
print("Model saved in file: %s" % save_path)      
        
plt.figure(figsize=(10,4))
plot_titles = ["Training accuracy", "Training Loss"]
for i, plot_data in enumerate([training_accuracy, training_loss]):
    plt.subplot(1, 2, i+1)
    plt.plot(plot_data)
    plt.title(plot_titles[i])

Figure 5: Progress while training

We can also plot the accuracy and loss at each training step. In an ideal scenario, the accuracy will increase over time whereas the loss will decrease.

Figure 6: Training accuracy and Loss

[Step 4] Using the model with different images:

Our final test consists of using the model with a completely new image that it hasn’t seen during training. We can pick one from the held-out set, or browse for cats or dogs on the internet, and pass the image to the classifier using the following code:

Snippet 9: Using the model

# Pick a random image from the held-out split; the model has not been trained on it
test_img = Image.open(X_test[int(np.random.uniform(0, len(X_test)))])

# Preprocess exactly as during training: resize, pad and scale pixel values to [0, 1]
input_array = np.array(padImg(resizeImg(test_img))).reshape([-1]) / 255
input_array = input_array.reshape([-1, 64, 64, 3])

# Class probabilities for the input image; argmax 0 means "cat", 1 means "dog"
softmax = tf.nn.softmax(output)
prediction = sess.run(tf.argmax(softmax, 1), feed_dict={x: input_array, train_phase: False, keep_prob: 1.0})
print("Predicted: " + ("Cat" if prediction[0] == 0 else "Dog"))
test_img

Figure 7: Model output with an unseen image

Hopefully, the model will accurately predict the class (cat or dog) for each image we input. There are also several other techniques we can use from this point onward in order to build a more precise model.

Discussion:

In this post, we’ve built a convolutional neural network model capable of classifying images based on whether they contain a cat or a dog. While we didn’t revisit all the terms and concepts required to fully understand what we coded and why, it’s a good starting point to see how these techniques can be used in real-life scenarios. Have you ever seen a captcha asking you to click on images containing, let’s say, cars in order to verify you are not a bot? Can you think of other possible use cases for this type of binary classification?

Find the full code of this post at: https://github.com/gariem/samples/tree/master/meetup/santex-machinelearning/cats-dogs

You heard it through the grapevine…

We have a couple of wine aficionados in Santex – Mariano Brolio and Maxi Sbrocca! This month, we asked them to tell us a bit more about their particular taste for the subject.

When did you first discover your love for wine?

Mariano Brolio: About when I turned 22. That’s when I started noticing different varietals (wines made from different types of grapes) and observing what I liked about each one.

Maxi Sbrocca: I think I’ve liked wine ever since I was a kid in high school — probably around 16 or 17 years old. Basically it’s something I’ve always enjoyed. What’s changed over time is the quality of wine that I like!

How would you describe your expertise? Have you ever taken formal classes?

MB: I don’t really consider myself to be an expert. It’s just that over the years I’ve learned which wines I like and which ones I don’t. I never took any lessons or classes, but I have gone to different degustations and events where they teach you what kind of food pairs well with each variety of wine, but these were always in relaxed and informal settings.

MS: I believe that the world of wine is one of great dimensions. Truthfully, I consider myself to be a novice when it comes to wine. I like it a lot, I enjoy it even more, but I’m a long ways from being a person who can provide recommendations about wine. I simply like talking about the wines that I like. I’ve never taken any classes, but I wouldn’t be opposed to doing so at some point.

Which Argentinian wine would you recommend to a friend traveling from abroad?

MB: It depends on what he/she is looking for. The top selections from the most well-known bodegas are all really good. One wine that I always recommend is Altura Máxima. It comes from a finca, or ranch-like estate, in the Calchaquíes Valley at almost 10,000 ft above sea level, and it is one of the best wines I have ever tasted. It’s produced only in relatively small batches, so I can never find a place to get it outside of Salta. If you can’t try that one, then the Fincas Notables collection from the bodega El Esteco is also really good.

If you haven’t noticed, I almost always recommend wine from the Salta region. They tend to be my favorite.

MS: As I said with the previous question, I think I’m a beginner when it comes to recommending wines. But if I had to recommend one to someone, it would undoubtedly be a Malbec. Ruttini and Catena Zapata are good premium brands.

Have you tried wine from other places around the world? Which regions do you prefer?

MB: I’ve tried wine from Chile, Australia, and the U.S. I didn’t really care for any of them very much – I’ve tried much better ones in Argentina. I’ve also tried a reserve wine from Spain that a friend brought over from a trip. It was pretty good, but again, you can find Argentinian wines that are similar or better. Maybe my palate is just used to Argentinian wine, haha.

MS: I’ve tried wine from the U.S. and to tell the truth, they weren’t very good. At least the ones that I tried weren’t. I’ve also tried Australian wine which was equally bad. Obviously with the wine we produce locally in Argentina, we have very high standards. There is a really good wine from Chile that’s a variety called Carmenere. They say that it’s the only place where that particular vine can be grown. It’s really tasty.

Is wine just a hobby for you, or do you see it turning into something else in the future?

MB: Yea, for me it’s just a hobby. A few years ago, a friend and I wanted to start our own business. Some boutique bodegas began to send us their products to sell them online and to family, but for the sake of time, we couldn’t keep up with it.

MS: Today, it’s just a hobby. As a consumer it’s something I enjoy, but I wouldn’t throw out the idea of it becoming something more in the future.

What is your favorite way to enjoy a glass of wine?

MB: There are two moments that I always enjoy most with wine:
– At an asado with friends, enjoying different bottles
– Once the day is over and I can relax and drink a glass, maybe with some nice cheese or pasta.

MS: I think the best way to enjoy a glass of wine is at a great asado surrounded by friends and family. I view wine as a means, not an end. It’s something that brings friendships together even more. This doesn’t, however, exclude those special occasions when you can enjoy an excellent glass of wine alone.

Internet of Things: Challenges

By Sebastian Pereira, Information Systems & Processes at Santex

The Internet of Things, or IoT, is the inter-networking of physical devices, vehicles, buildings and other items embedded with electronics, software, sensors, actuators, and network connectivity which enable these objects to collect and exchange data.

Wikipedia states that the first discussion around “a thing” that was interconnected was a Coke machine that would determine how many drinks were available.

IoT use cases span several areas, including connected homes, consumer electronics, industry, retail, logistics, government, and automobiles, among others.

But if it’s been around for so many years – perhaps with different names but with the same discussions in the background – why are we not surrounded by all these things already? Jayson DeMers states in his Forbes article that part of the problem is too much competition with not enough collaboration, something that discourages and delays the revolution. It is also encouraging that bigger companies such as Google, Amazon and Apple are stepping into the problem, as they might drive that collaboration with their own applications and infrastructure.

This competition and lack of collaboration can be seen as symptoms of a much bigger problem, one that poses a series of challenges that everybody will have to solve and agree upon for the revolution to come. We are talking about a potential market of billions of dollars for services and applications that range from apparently trivial things to more complex ones with higher impact on societies and economies.

Following are the top 3 challenges I think we have on the horizon with IoT:

  • Security. This is mainly based on the fact that IoT implies more and more devices will be connected through networks, and therefore there will be more opportunities for hackers to exploit. The one thing we learned over the years is that no one is secure, and basically more interconnected devices means more problems – at least more problems to be solved.

To serve as a terrifying example – one of many – take a look at this article, where hackers stopped a journalist’s car in the middle of a highway to prove their point: they can control a car remotely. They described this scary scenario: “Imagine last year if instead of cutting the transmission on the highway, we’d turned the wheel 180 degrees,” says Chris Valasek. I can imagine. But he spells it out anyway. “You wouldn’t be on the phone with us. You’d be dead.”

  • Connectivity. If we go back to the definition stated at the beginning of this article, IoT implies that a lot of devices are interconnected, and aside from the security challenge, this means we will need more efficient ways to have all of these devices talk to each other. The (now) classic centralized paradigm for authorizing and connecting the different nodes within a network may be heading toward a decentralized model, such as Fog Computing, in which data, computing, storage, and applications are distributed in the most logical, efficient way between the data source and the cloud. There are other solutions where the decentralization is higher, but certainly each option is tied to different security challenges.
  • Making sense of the data. Does Big Data sound big enough? Imagine something even bigger. IoT will pose the challenge of cleaning, processing and interpreting vast amounts of data produced by sensors. Gartner estimates that there will be 25 billion connected things in use by 2020, while Cisco’s forecast is 50 billion connected devices. As an example from the airline industry, at last year’s Paris Air Show, Bombardier showcased its C Series jetliner, which carries Pratt & Whitney’s Geared Turbo Fan (GTF) engine fitted with 5,000 sensors that generate up to 10 GB of data per second.

So basically, applying analytics to IoT takes the same general approach on a different scale: how data is collected, and how it is interpreted and used. This Forbes article states that applying Machine Learning to analyze the data will be a more efficient option. Given that the current manufacturers of sensors and IoT applications are not experts at analyzing data and extracting good-quality, actionable information, new services are starting to appear on the horizon. One example is Machine Learning as a Service, where buyers can quickly get the insights they need without making huge investments in technologies that are not core to their business.

Of course there are many other challenges with IoT and many problems to solve, but that’s what makes this so interesting to talk about and work on. Other challenges worth mentioning are:

  • Power
  • Sensors
  • Standards
  • User Experience and User Interface
  • Waste Disposal
  • Data Storage
  • Laws and Regulations

It certainly represents a huge opportunity for business, whether for good or bad purposes.


Things to keep in mind before adding a Software Dependency to your project

By Agustin Aliaga, Mobile Developer at Santex

In my work experience, one basic thing I learned about software engineering is that you don’t need to “reinvent the wheel” every time you want to achieve some functionality. Open source projects have revolutionized the way we work in that we can reuse existing software in addition to collaborating with other devs. In the web-development ecosystem, there are plenty of frameworks and tools that already simplify things like user authentication, routes, templating (client-side and server-side), state management, database queries, web sockets, etc. On the other hand, sometimes the existing solutions are just not good enough, or there may be no alternatives at all, but that’s a completely different story.

The ability to know when to implement a feature yourself and when to use an existing solution is a crucial asset for your team. Adopting a new library, language or technology as a dependency to build your product without extensive research could become a headache in the future, so you should always ask yourself at least these questions about it:

1. Does it meet all your needs?
Sometimes you’ll find a solution for your problem that does not cover all the specific features you need. In that case, you might have to deal with forking and extending it (if it’s an open source project), and this means greater time investments and costs. Are your team and the client prepared for this scenario?

2. Is there any documentation?
If so, is it well documented? Just as an example, one of the things I like the most about Django (web framework) is the quality they put into the docs. It’s remarkably easy to find the topics you need for the framework version you’re using. https://docs.djangoproject.com/en/.

3. Is it supported by a “big” community and/or a private company?
Having a company or a community behind it helps a lot when you’re having trouble and need assistance from others. You may have to send a “help-desk” ticket (probably if it’s a paid service), find information on blogs or StackOverflow, or maybe even post a question to those sites. If you’re relying on the community to help you, your chances of being helped are proportional to the popularity of the software dependency.

4. Is it an “external service”?
If you rely on services like “Google Maps API”, “Facebook Graph API”, “Google’s Firebase”, etc. be aware that they may change in the future without notice, or they could just stop working at any time (temporarily or permanently). SaaS/BaaS solutions are great but you should think twice before setting them up as a critical piece of your system. Just as an example, read about what happened to Facebook’s Parse: (https://techcrunch.com/2016/01/28/facebook-shutters-its-parse-developer-platform/).

5. Is it actively maintained and improved?
If hosted on GitHub, the “Pulse” and “Graphs” tabs will give you an idea of the latest activity. You probably don’t want to adopt an outdated library, because it could bring compatibility issues to your project. Also, if it’s constantly evolving, it could mean you’ll have to update your code repeatedly.

6. Is it tested?
Some libraries use automated tools to build and test every change that is introduced (applying continuous integration tools like Travis CI, Circle CI, etc.). This makes the library more reliable.

7. Are you compromising another dependency if you adopt this new library?
Sometimes libraries don’t play well together.

8. Will it affect your product’s performance, speed, size, etc.?
You should always take this into consideration. In the “web environment”, a giant front-end library could affect the browser’s performance and also increase network transfer times. On the back-end side, you want to avoid server overloading. In the mobile world, things get even more critical because mobile phones don’t have as many resources as a desktop computer. In Android, an app that wastes memory and CPU is a real candidate to be killed automatically by the operating system.

What about Android?

The core-functionalities that Android brings to the table are sometimes more than enough to build simple applications. You could build an entire app by using bare Activities, Fragments, Views, AsyncTasks, Services, Content Providers, Broadcast Receivers, etc.

But in my experience, sometimes this means you’ll have to write (and then maintain) a lot of repetitive/boilerplate code. In other words, sometimes sticking to the core framework means you will have to invest more time taking care of all the details. Some examples of libraries that made me more productive in Android development are Retrofit, Dagger 2, and Butter Knife.

You should also know that if you add too many direct and transitive dependencies (plus your own code), you might exceed the “64K-method limit”, as explained in the Android documentation:

“Android app (APK) files contain executable bytecode files in the form of Dalvik Executable (DEX) files, which contain the compiled code used to run your app. The Dalvik Executable specification limits the total number of methods that can be referenced within a single DEX file to 65,536—including Android framework methods, library methods, and methods in your own code. In the context of computer science, the term Kilo, K, denotes 1024 (or 2^10). Because 65,536 is equal to 64 X 1024, this limit is referred to as the ‘64K reference limit’.”

If you exceed this limit, you’ll have to change your application to support “multidex”, which means it will load multiple DEX files. This results in higher compilation times, as well as performance and loading issues in the app itself. So I’d recommend always being careful with the dependencies you add to your Android project.

Conclusion

I have seen these concepts apply not only to Android development (a technology I use every day at work), but to all software development in general. Every product relies on dependencies (whether it’s an OS, a service, a framework, a library or some other kind of software). The goal is to pick the best ones to maximize your productivity, without affecting your product’s performance, scalability, and capacity to evolve over time.

When (in my opinion) Not to Use Drupal

By Sebastian Gonzalez – Drupal Developer at Santex

I would like to clarify first off that I love to work with Drupal. I’ve been working with the Drupal platform for about 10 years now, and through all those years of getting frustrated over the same things, I realized something. I noticed that when certain clients or businesses had a previous project in Drupal that was successful, they would want to handle any future projects in the same way, when in reality Drupal may not have been the best tool to use.

In all these years of experience, I came across various projects and had a lot of different experiences – some very rewarding and others not so great. In some of these last projects that I didn’t think were so great, I noticed that something kept repeating. Drupal was being used for any kind of project on the simple premise that “it can do everything.”

If a client needs just about any sort of app, we as developers usually say that Drupal is the solution. But what we should say is that Drupal could be the solution. Changing the message from “Drupal can do that” to “Drupal should be able to do that” is fundamental to starting any project off on the right foot.

Drupal is a CMS (Content Management System) that was intended to be a content administrator. Every page in the world has content, and when we talk about ‘content,’ we automatically think that it should be able to be handled administratively. This leads one to automatically think of a CMS like Drupal, WordPress, or Joomla. For me, the important question is what you want to do with the content. Where is this going and what is it going to be used for?

A lot of people view Drupal as a CWMS (Content Workflow Management System), and I agree with this vision. In my opinion, it makes sense to use Drupal when a business’s domain entails a lot of different types of content with multiple users who have different levels of permission. All of these users can alter the state of the content, making it fluctuate through different phases of the workflow where there aren’t annotations, reports, or emails involved.

The reality is that the vast majority of websites built using Drupal should not have used Drupal. This is not because Drupal can’t do the job, but rather because it’s a waste of all its functionalities that end up not getting used. A clear case of this is with classic brochure websites or institutional sites where the content is static and hardly changes over time. There isn’t much interaction between users beyond basic contact forms or a comments section.

Our world is currently dominated by mobile devices. Drupal was able to enter the competition with its latest version, Drupal 8, released in November 2015. Using and integrating components from the popular Symfony framework provides a robust back-end that facilitates API development. Drupal is jumping onto this trend with something called Headless – an architecture that uses Drupal as the back-end, paired with a framework such as AngularJS, React.js, or any other framework to present the data.

In summary, I believe Drupal should not be used for:

  • Simple brochure websites
  • Single-purpose apps (like a chat application)
  • Gaming apps

I think Drupal should be used for:

  • News websites with multiple users
  • Multi-user publishing apps
  • Any app or website that includes workflows among people with different roles/permissions
  • A mobile version for Drupal

To conclude, here are 4 more pieces of advice:

  • Choosing one tool or another has to do with understanding the business’s control over the application or website. The more you know about the project, the greater the decision power in choosing which platform to use to meet those needs.
  • Use Drupal from the start of a project; don’t try to switch to it for something else when things are not properly in place.
  • Stop saying “Drupal is the solution” and start saying “Drupal could be the solution!”
  • Always explore alternatives, because new technologies are coming out every day.

Those are my two cents.

 

About the Author – Sebastian Gonzalez is an experienced Drupal Developer at Santex, passionate about his work. Sebastian is a strategic team player, always willing to contribute and to solve problems.

7 Tips for Automation Testing

Luckily, the term Automation is becoming more common and popular these days in the immense world of IT companies. You just have to search the web a little to find hundreds or thousands of articles in all languages talking about the benefits of automated testing and how much money companies can save by using it, so my idea is not to repeat my colleagues’ comments, but rather to share some of my experiences from more than 5 years of working in QA.

I worked on 3 giant projects: the website of a major airline, a video-on-demand provider, and a security application from one of the most famous antivirus vendors. I also participated in small projects where manually running the same test suites every day, up to 3 times a day, made me realize how necessary and beneficial it is to automate.


Here are 7 tips I learned from automating that I would like to share with you:

  1. Code reviews by other QAs and/or developers, as well as by the PO or the BA, are of GREAT importance.

  2. Reuse code. Writing the same code over and over again can be a waste of time when the changes in the data set are minimal.

  3. The tests have to be fail-proof: they should only fail due to errors in the product, environment, etc., and not because of a bad analysis made before creating them. This also applies to unit tests.

  4. Ask for help. We are all proud people and it is hugely satisfying to complete a challenging task without turning to someone for help, but sometimes pride translates into lost hours in the sprint, lost money for the client and the company, and can even delay the tasks of our peers.

  5. Respect good practices. When working as a team we must remember that our code can affect the code or work of others.

  6. Automated tests are not only a good tool for testers but also, when used correctly, can be very useful for developers.

  7. Adapting is very important. Sometimes because of licensing issues or for a number of other reasons, we may have to automate in a language with which we do not feel comfortable or simply do not like. Despite not enjoying it when it happened to me, I understood that the language was the right one for the software to be tested, and today I can say that at least I have some experience in other languages and technologies that will surely be useful again throughout my career.

Hopefully these tips can help testers and developers who are not yet familiar with Automation to understand more about its importance. At Santex, we are always open to sharing knowledge and listening to new experiences and opinions, so feel free to leave your thoughts on automation.

About the author: Mauricio Ardiles is an enthusiastic QA Analyst seasoned in a variety of testing skills. Strong background in automation testing and a certified Scrum Master. 

 

One day of Singularity University at The Tech Pub


This month at Santex, we had the pleasure of hosting Peter Wicher, Director of Strategic Relations at Singularity University. Over his extensive career, Peter has held executive positions in Silicon Valley in the industries of consumer electronics, semiconductors, education, and integrated systems.

We asked our Director of Operations, Eduardo Coll, to tell us a bit about the experience.

SANTEX: Tell us a bit about what Singularity University is and what it stands for.

Eduardo Coll: We had the pleasure of having Peter Wicher from Singularity University (SU) visit our Tech Pub. SU is a university founded by Peter Diamandis, with sponsorship from Google, NASA, and Autodesk, among other large companies.

Its vision is to create technology solutions that can make a positive impact on a billion people. There are 7 billion people in the world, and SU is trying to impact 1 billion. The university has unique programs for both individuals and organizations. They also have an incubator where they work on projects across verticals, which are selected because they aim to resolve some of the great problems facing humanity today – like renewable energy and access to drinking water. One of the concepts that they promote in SU, which Peter Wicher explained to us during his visit, is the concept of “exponentiality” – exponential technology and exponential growth.

SANTEX: Elaborate more on this concept. Where does this exponential growth in technology go?

EC: For humans, by nature, it’s easier to understand linear growth. We grow older linearly; we don’t pass from 2 to 50 years old without turning 3, 4, etc. We physically grow in the same way, gaining weight in successive numbers, and therefore we tend to forget that there can also be exponential growth in certain things. SU strives to leave its students with the ideology that exponential growth can be applied to our daily lives and that we don’t have to act within a linear mindset. This enables people to achieve their visions more rapidly, and helps the university reach its goal of impacting 1 billion people.

The motto at SU is “don’t make something better by 10%, make it 10x better!” When you think about changing or creating something, you don’t have to make something new that’s 10% better than what already exists. You should strive to make it 10 times better! When you do this, your business or your technology or idea will grow exponentially. Some familiar examples that they point out are: Uber generated a revolution in the transportation industry using technology that was already over 10 years old – cell phones, GPS, web services. They simply made it into an app and the number of trips, drivers, and passengers increased exponentially. They created an exponentially better transportation system.

SU also mentions Airbnb, which revolutionized the hospitality system with the idea of renting homes and individual rooms for less than the cost of staying in a hotel or vacation house. These technologies and systems are disruptive and obviously have their flaws. They have problems that occasionally need to be fixed and that need to be looked at in their entirety, but we’ll save that for another article.

SANTEX: How do some of the concepts learned during SU’s visit pertain to Santex currently?

EC: Peter brought us some lessons that really dazzled us, lessons that we should apply to our lives daily both as an organization and individually. With existing technology and new technologies to come, the objectives that SU presented to us will be made simple.

At Santex, we are beginning to understand and work with the concept of exponential growth and the use of technology to solve some of humanity’s biggest problems. As a company, we want to support our clients with the knowledge needed to help them grow, which in turn will help us grow as well. We’ve set the goals of improving 10x more for every new project that we take on, and striving to solve problems that help not only our clients, but us as an organization and the community we’re involved in.

About Singularity University (SU)

SU is an academic institution in Silicon Valley whose focus is to bring together, educate, and inspire leaders about the exponential power of technology to solve some of the greatest challenges facing mankind.

A journey to fitness and health

By Sebastian Gonzalez – Drupal Developer at Santex

My journey began in early 2014, when I went to see a dermatologist about a few spots I had found on my legs. The dermatologist told me that it was acanthosis pigmentaria and that it was due to my being overweight. At that time, my weight was about 110 kilos (242 lbs.) or a little more, and I had a very poor diet – lots of takeout, lots of soda, many processed foods (bread, pastries), only a few vegetables and no fruit.

I signed up for a gym without a clear idea of what to do, just that I should lose weight. At first, they recommended that I start on the stationary bike in addition to an exercise routine that was, at the very least, boring. Those were the first few weeks; then, little by little, I began to like the routines that the coach was giving me and I started to enjoy training at the gym. In the meantime, the coach started demanding more and more of me.

When it came to my diet, I knew that it was the key to losing weight. Little by little I started changing certain eating habits. I started to plan my meals and set a day when I could eat more fattening foods (I still eat sandwiches and pizzas on my “cheat” days). I had to change my breakfast habits too – I used to not eat breakfast at all or just grab something quick on the way to work. I stopped drinking soda and eating certain foods at night.

Within the first 14 months, I lost almost 30 kilos (66 lbs). At first glance, this seems like a large number, but I did some calculations and it evens out to about 500 grams (about 1 lb) per week. The truth is that weight loss is not a consistent progression. Sometimes I lost 500 g in a week, and then weeks would go by where I would plateau and not lose even a gram. It’s times like that when you need to learn to be persistent and not give up, and just continue trying day after day. It’s not an easy journey that one sets out on to try to change or improve one’s life, but the people by my side supported and encouraged me along the way, and they didn’t let me fall. It’s also nice to hear friends or coworkers comment on the weight you’ve lost and how they notice your body changing. That’s encouraging.

Today, my food plan is quite varied. My breakfasts usually include yogurt, fruit, granola, and peanut butter. My lunches focus on protein like beef, chicken, pork, and eggs, in addition to a healthy source of carbohydrates like broccoli, spinach, sweet potatoes, and brown rice. Dinners also revolve around a protein and lots of veggies like lettuce, tomato, carrots, arugula, cabbage, etc.

When people ask me how I lost so much weight and totally changed my body, the first thing I say is that you have to have persistence and the drive to keep going. There are going to be difficult moments in which you feel anxious and want to buy everything at the grocery store! But you’ll see how your body changes bit by bit and you’ll feel stronger and have more energy. You won’t get as tired at work – you’ll sleep better at night and feel more awake during the day. You can’t put a price on those feelings. I always recommend doing a physical activity, whatever it may be. If you like soccer, play it. If you like to run, go do it. If you’re a gym person like myself, be methodical with your training plan. Whatever the activity you prefer, the important thing is to stick to it.

My journey continues, and each day I strive to improve my diet and improve my training. I try to communicate my life experiences with others so that they feel motivated to become more physically active. With a little bit of perseverance you can achieve big things.