Domain > adlersantos.github.io
More information on this domain is in AlienVault OTX.
DNS Resolutions
Date        IP Address
2018-03-12  151.101.45.147 (ClassC)
2024-08-24  185.199.111.153 (ClassC)
Port 80
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 18991
Server: GitHub.com
Content-Type: text/html; charset=utf-8
permissions-policy: interest-cohort=()
Last-Modified: Fri, 30 Mar 2018 23:49:34 GMT
Access-Control-Allow-Origin: *
ETag: "5abecd0e-4a2f"
expires: Sat, 24 Aug 2024 09:02:28 GMT
Cache-Control: max-age=600
x-proxy-cache: MISS
X-GitHub-Request-Id: BEAC:2AB063:3EECAED:4091703:66C99F4C
Accept-Ranges: bytes
Age: 0
Date: Sat, 24 Aug 2024 08:52:28 GMT
Via: 1.1 varnish
X-Served-By: cache-bfi-kbfi7400050-BFI
X-Cache: MISS
X-Cache-Hits: 0
X-Timer: S1724489549.612801,VS0,VE87
Vary: Accept-Encoding
X-Fastly-Request-ID: f2a20fc03f3cdd22ec9323bb796687062d465215

Writings
Data Scientist and Software Engineer

Transforming Python Lists into Spark Dataframes
March 30, 2018 | /articles/18/transforming-python-list-to-spark-dataframe

Data represented as dataframes is generally much easier to transform, filter, or write to a target source. In Spark, data loaded or queried from a source is automatically returned as a dataframe.

Here’s an example of loading, querying, and writing data using PySpark and SQL:

    import pyspark

    # Define your SparkContext and SparkSession
    sc = pyspark.context.SparkContext(master="host", appName="Sample App")
    session = pyspark.sql.session.SparkSession(sc)

    # Load your data using:
    #   - session.read.json("some/path/or/url")
    #   - session.read.parquet("some/path/or/url")
    #   - session.read.csv("some/path/or/url")
    #   - session.read.text("some/path/or/url"), etc.
    data = session.read.json("some/path/or/url")
    data.createOrReplaceTempView("table")

    # Apply some SQL query to the data, which results in a DataFrame
    df = session.sql("""
        select col1, col2, sum(col3)
        from table
        where col4 = some_val
        group by col1, col2
    """)

    # Write the query results to a target in your desired format (say, JSON)
    df.write.json("target/path/")

The example above works conveniently if you can easily load your data as a dataframe using PySpark’s built-in functions. But sometimes your processed data ends up as a list of Python dictionaries, say when you weren’t required to use session.read and/or session.sql. How can you load your data as a Spark DataFrame in order to take advantage of its capabilities?

How the loss and cost functions got their forms in logistic regression
March 10, 2018 | /articles/18/logistic-regression-loss-cost-forms

In logistic regression, the loss function $\mathcal{L}(\hat{y}, y)$ and the cost function $J$ take the forms

\begin{align}
\mathcal{L}(\hat{y}, y) &= - \left(y \log (\hat{y}) + (1 - y) \log (1 - \hat{y}) \right) \\
J &= \frac{1}{m} \sum_{i=1}^m \mathcal{L}\left(\hat{y}^{(i)}, y^{(i)}\right)
\end{align}

where $\hat{y} = \sigma(W^\text{T} X + b)$ is the sigmoid of the linear combination of the features represented in matrix form, with $W$ as the vector of the feature weights and $b$ as the intercept term. The sigmoid of some function $z$ is defined as

\[ \sigma(z) = \frac{1}{1 + e^{-z}}. \]

To understand why $\mathcal{L}$ and $J$ take such forms, first note that $\hat{y}$ is the probability that the binary classification variable $y$ is a positive example ($y = 1$).

Using Breaks To Get More Deep Work Done
January 27, 2018 | /articles/18/deep-work-taking-breaks

It’s a great thing to constantly have goals that require prolonged periods of deep concentration. This is something I always look forward to. Deep work gives us a sense of great accomplishment when we’re finished, and it expands our expertise in the domains we’ve tackled along the way.

But of course, many of us don’t buy into the “delayed gratification” thing, perhaps due to biological and historical reasons. (Sidenote: Sadly, many of us never liked school in the past, and would be happy to never go back to school again.) Nature, by default, always follows the path of lowest energy and/or least resistance.

How Are Bubbles Formed?
January 8, 2018 | /articles/18/how-are-bubbles-formed

I’ve always thought of a bubble as a compounded result of residual greed when the optimism of the many is perpetually validated.

This reminds me of what Warren Buffett tells us to do when we see compelling evidence of an impending bubble:

    “Be fearful when others are greedy, and greedy when others are fearful.” – Warren Buffett

But only a few have the discipline to do this, because everyone can easily forget about the principle when they’re constantly seduced by social proof of everyone they know getting rich, everywhere they look.

On Change vs. Being Who You Are
August 20, 2017 | /articles/17/changing-your-behaviour-not-who-you-are

One of the things I’ve come to understand better lately is the concept of adapting how I behave to the context I’m in. I’m aware that I’ve been doing this, but I was also constantly questioning it.

I’ve been wondering if I might be betraying myself, or whether it is psychologically unhealthy in the long term to act against my familiar behavioural inclinations.

Machine Learning: Entropy and Classification
April 2, 2017 | /articles/17/machine-learning-entropy-and-classification

A Simple Classification Example

Let’s say we have a dataset with categorical features $P$, $Q$, $R$, and a binary target variable $Z$:

    id  Feature P  Feature Q  Feature R  Target Variable Z
    1   a          c          e          G
    2   b          d          e          G
    3   b          d          f          H
    4   a          d          e          G
    5   a          c          f          H
    6   b          d          f          H

The goal is to find the feature that best predicts the value of $Z$.

MySQL: Columns as Ordered Week Dates
March 8, 2016 | /articles/16/pivoting-mysql-columns-as-dates

Let’s say you have data containing some metrics and their values across an ordered set of dates in a week. Since most screens are wider than they are tall, it’s sometimes better to present data where each metric lies in a row and the dates lie in columns, rather than the other way around.

The usual way we show tables is like this:

    date        Visitors  Orders  Revenue  Metric4  etc.
    2016-02-28  1423      19      900      …        …
    2016-02-29  1534      38      2037     …        …
    2016-03-01  2645      57      5612     …        …
    …           …         …       …        …        …

Because most screens are in landscape mode and because we read from left to right, there are times when it makes sense to pivot the table as follows:

    metric    2016-02-28  2016-02-29  2016-03-01  …
    Visitors  1423        1534        2645        …
    Orders    19          38          57          …
    Revenue   900         2037        5612        …
    Metric4   …           …           …           …
    Metric5   …           …           …           …
    etc.      …           …           …           …

This may not be “tidy data” as defined by Hadley Wickham in his excellent paper (https://www.jstatsoft.org/article/view/v059i10), but pivoting this way makes navigation and scrolling easier when you have more metrics than dates.

Deriving the Normal Equation
November 22, 2015 | /articles/15/deriving-normal-equation

Consider a linear model

\[ X\vec{\theta} = \vec{y} \]

where

\[ X = \left( \begin{array}{ccc} x_{1,1} & \dots & x_{1,n} \\ \vdots & \ddots & \vdots \\ x_{m,1} & \dots & x_{m,n} \end{array} \right) \]

is a matrix of real numbers with $m$ as the number of samples (or rows) and $n$ as the number of features (or columns),

\[ \vec{\theta} = \left( \begin{array}{c} \theta_1 \\ \vdots \\ \theta_n \end{array} \right) \]

is a matrix (also called a vector) of coefficients $\theta_i$, and

\[ \vec{y} = \left( \begin{array}{c} y_1 \\ \vdots \\ y_m \end{array} \right) \]

is a matrix of target variables $y_i$, one per sample $i$.

A Beginners Guide on Using Data to Assess Business Performance
September 1, 2015 | /articles/15/beginners-guide-using-data-assess-business-performance

Running an online business that’s growing slower than projected is never an ideal scenario. What can tremendously help diagnose the problem is to have data and know how to gain insights from it. It is only through the collection and analysis of data that you can free yourself from guesswork, start validating assumptions, and gain insights on how you should be operating your business.

© 2018 ADLER SANTOS
Powered by Tufte theme for Jekyll (//github.com/clayh53/tufte-jekyll).
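The entropy excerpt in the captured page stops at its goal, finding the feature that best predicts Z, without showing a method. As a minimal sketch, assuming information gain (entropy of Z minus the weighted entropy of Z after splitting on a feature) is the criterion intended, the six-row table can be scored as follows. The names `ROWS`, `entropy`, and `information_gain` are hypothetical and do not come from the page.

```python
from collections import Counter
from math import log2

# Rows of the example table as (P, Q, R, Z); copied from the captured page.
ROWS = [
    ("a", "c", "e", "G"),
    ("b", "d", "e", "G"),
    ("b", "d", "f", "H"),
    ("a", "d", "e", "G"),
    ("a", "c", "f", "H"),
    ("b", "d", "f", "H"),
]
FEATURES = {"P": 0, "Q": 1, "R": 2}  # feature name -> column index
TARGET = 3                           # column index of Z


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, column):
    """Entropy of Z minus the weighted entropy of Z after splitting on `column`."""
    base = entropy([row[TARGET] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[column], []).append(row[TARGET])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Assumed criterion: the feature with the highest information gain is "best".
gains = {name: information_gain(ROWS, col) for name, col in FEATURES.items()}
best_feature = max(gains, key=gains.get)
```

Under this assumption, feature R scores highest on this table, because every row with R = e has Z = G and every row with R = f has Z = H.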
Port 443
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 18991
Server: GitHub.com
Content-Type: text/html; charset=utf-8
permissions-policy: interest-cohort=()
Last-Modified: Fri, 30 Mar 2018 23:49:34 GMT
Access-Control-Allow-Origin: *
ETag: "5abecd0e-4a2f"
expires: Sat, 24 Aug 2024 09:02:28 GMT
Cache-Control: max-age=600
x-proxy-cache: MISS
X-GitHub-Request-Id: 8019:10DA:9069D2:93D7BB:66C99F4C
Accept-Ranges: bytes
Age: 0
Date: Sat, 24 Aug 2024 08:52:28 GMT
Via: 1.1 varnish
X-Served-By: cache-bfi-kbfi7400094-BFI
X-Cache: MISS
X-Cache-Hits: 0
X-Timer: S1724489549.743339,VS0,VE72
Vary: Accept-Encoding
X-Fastly-Request-ID: a74b34159e258ebf4d0331f047d9b9f81172d431

(Response body identical to the Port 80 response body.)
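Both responses carry the same ETag, 5abecd0e-4a2f, alongside the same Last-Modified and Content-Length values. A minimal sketch, assuming the ETag follows the common `<hex modification time>-<hex size>` layout (an assumption; the page does not say how the value was generated), decodes it so it can be compared against those two headers. The helper `decode_etag` is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def decode_etag(etag):
    """Split a '<hex mtime>-<hex size>' ETag into (UTC datetime, size in bytes).

    Assumes the two-field hexadecimal layout; other ETag schemes will not parse.
    """
    mtime_hex, size_hex = etag.strip('"').split("-")
    modified = datetime.fromtimestamp(int(mtime_hex, 16), tz=timezone.utc)
    return modified, int(size_hex, 16)


# ETag value taken from the captured response headers.
modified, size = decode_etag("5abecd0e-4a2f")

# Render the timestamp in HTTP date format for comparison with Last-Modified.
last_modified = format_datetime(modified, usegmt=True)
```

If the assumption holds, `size` should equal the Content-Length header and `last_modified` should equal the Last-Modified header, which is a quick consistency check that the two captures describe the same unmodified file.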
Data with thanks to AlienVault OTX, VirusTotal, Malwr and others.