Pricesearcher: The biggest search engine you’ve never heard of

“Hey Siri, what is the cost of an iPad near me?”

In today’s internet, a number of specialist search engines exist to help consumers search for and compare things within a specific niche.

As well as search engines like Google and Bing which crawl the entire web, we have powerful vertical-specific search engines like Skyscanner, Moneysupermarket and Indeed that specialize in surfacing flights, insurance quotes, jobs, and more.

Powerful though web search engines can be, they aren’t capable of delivering the same level of dedicated coverage within a particular industry that vertical search engines are. As a result, many vertical-specific search engines have become go-to destinations for finding a particular type of information – above and beyond even the all-powerful Google.

Yet until recently, one major market remained unsearchable: prices.

If you ask Siri to tell you the cost of an iPad near you, she won’t be able to provide you with an answer, because she doesn’t have the data. Until now, a complete view of prices on the internet has never existed.

Enter Pricesearcher, a search engine that has set out to solve this problem by indexing all of the world’s prices. Pricesearcher provides searchers with detailed information on products, prices, price histories, payment and delivery information, as well as reviews and buyers’ guides to aid in making a purchase decision.

Founder and CEO Samuel Dean calls Pricesearcher “The biggest search engine you’ve never heard of.” Search Engine Watch recently paid a visit to the Pricesearcher offices to find out about the story behind the first search engine for prices, the technical challenge of indexing prices, and why the future of search is vertical.

Pricesearcher: The early days

A product specialist by background, Samuel Dean spent 16 years in the world of ecommerce. He previously held a senior role at eBay as Head of Distributed Ecommerce, and has carried out contract work for companies including Powa Technologies, Inviqa and the UK government department UK Trade & Investment (UKTI).

He first began developing the idea for Pricesearcher in 2011, purchasing the domain Pricesearcher.com in the same year. However, it would be some years before Dean began work on Pricesearcher full-time. Instead, he spent the next few years taking advantage of his ecommerce connections to research the market and understand the challenges he might encounter with the project.

“My career in e-commerce was going great, so I spent my time talking to retailers, speaking with advisors – speaking to as many people as possible that I could access,” explains Dean. “I wanted to do this without pressure, so I gave myself the time to formulate the plan whilst juggling contracting and raising my kids.”

More than this, Dean wanted to make sure that he took the time to get Pricesearcher absolutely right. “We knew we had something that could be big,” he says. “And if you’re going to put your name on a vertical, you take responsibility for it.”

Dean describes himself as a “fan of directories”, relating how he used to pore over the Yellow Pages telephone directory as a child. His childhood also provided the inspiration for Pricesearcher in that his family had very little money while he was growing up, and so they needed to make absolutely sure they got the best price for everything.

Dean wanted to build Pricesearcher to be the tool that his family had needed – a way to know the exact cost of products at a glance, and easily find the cheapest option.

“The world of technology is so advanced – we have self-driving cars and rockets to Mars, yet the act of finding a single price for something across all locations is so laborious. Which I think is ridiculous,” he explains.

Despite how long it took to bring Pricesearcher from idea to launch, Dean wasn’t worried that someone else would launch a competitor search engine before him.

“Technically, it’s a huge challenge,” he says – and one that very few people have been willing to tackle.

There is a significant lack of standardization in the ecommerce space, in the way that retailers list their products, the format that they present them in, and even the barcodes that they use. But rather than solve this by implementing strict formatting requirements for retailers to list their products, making them do the hard work of being present on Pricesearcher (as Google and Amazon do), Pricesearcher was more than willing to come to the retailers.

“Our technological goal was to make listing products on Pricesearcher as easy as uploading photos to Facebook,” says Dean.

As a result, most of the early days of Pricesearcher were devoted to solving these technical challenges for retailers, and standardizing everything as much as possible.

In 2014, Dean found his first collaborator to work with him on the project: Raja Akhtar, a PHP developer working on a range of ecommerce projects, who came on board as Pricesearcher’s Head of Web Development.

Dean found Akhtar through the freelance website People Per Hour, and the two began working on Pricesearcher together in their spare time, putting together the first lines of code in 2015. The beta version of Pricesearcher launched the following year.

For the first few years, Pricesearcher operated on a shoestring budget, funded entirely out of Dean’s own pocket. However, this didn’t mean that there was any compromise in quality.

“We had to build it like we had much more funding than we did,” says Dean.

They focused on making the user experience natural, and on building a tool that could process any retailer product feed regardless of format. Dean knew that Pricesearcher had to be the best product it could possibly be in order to be able to compete in the same industry as the likes of Google.

“Google has set the bar for search – you have to be at least as good, or be irrelevant,” he says.

PriceBot and price data

Pricesearcher initially built up its index by directly processing product feeds from retailers. Some early retail partners who joined the search engine in its first year included Amazon, Argos, IKEA, JD Sports, Currys and Mothercare. (As a UK-based search engine, Pricesearcher has primarily focused on indexing UK retailers, but plans to expand more internationally in the near future).

In the early days, indexing products with Pricesearcher was a fairly lengthy process, taking about 5 hours per product feed. Dean and Akhtar knew that they needed to scale things up dramatically, and in 2015 began working with a freelance dev ops engineer, Vlassios Rizopoulos, to do just that.

Rizopoulos’ work sped up the process of indexing a product feed from 5 hours to around half an hour, and then to under a minute. In 2017 Rizopoulos joined the company as its CTO, and in the same year launched Pricesearcher’s search crawler, PriceBot. This opened up a wealth of additional opportunities for Pricesearcher, as the bot was able to crawl any retailers who didn’t come to them directly, and from there, start a conversation.

“We’re open about crawling websites with PriceBot,” says Dean. “Retailers can choose to block the bot if they want to, or submit a feed to us instead.”

For Pricesearcher, product feeds are preferable to crawl data, but PriceBot provides an option for retailers who don’t have the technical resources to submit a product feed, as well as opening up additional business opportunities. PriceBot crawls the web daily to get data, and many retailers have requested that PriceBot crawl them more frequently in order to get the most up-to-date prices.

Between the accelerated processing speed and the additional opportunities opened up by PriceBot, Pricesearcher’s index went from 4 million products in late 2016 to 500 million in August 2017, and now numbers more than 1.1 billion products. Pricesearcher is currently processing 2,500 UK retailers through PriceBot, and another 4,000 using product feeds.

All of this gives Pricesearcher access to more pricing data than has ever been accumulated in one place – Dean is proud to state that Pricesearcher has even more data at its disposal than eBay. The data set is unique, as no-one else has set out to accumulate this kind of data about pricing, and the possible insights and applications are endless.

At Brighton SEO in September 2017, Dean and Rizopoulos gave a presentation entitled ‘What we have learnt from indexing over half a billion products’, presenting data insights from Pricesearcher’s initial 500 million product listings.

The insights are fascinating for both retailers and consumers: for example, Pricesearcher found that the average length of a product title was 48 characters (including spaces), with product descriptions averaging 522 characters, or 90 words.

Less than half of the products indexed – 44.9% – included shipping costs as an additional field, and two-fifths of products (40.2%) did not provide attributes such as size and color.

Between December 2016 and September 2017, Pricesearcher also recorded 4 billion price changes globally, with the UK ranking top as the country with the most price changes – one every six days.

It isn’t just Pricesearcher who have visibility over this data – users of the search engine can benefit from it, too. On February 2nd, Pricesearcher launched a new beta feature which displays a pricing history graph next to each product.

This allows consumers to see exactly what the price of a product has been throughout its history – every rise, every discount – and use this to make a judgement about when the best time is to buy.

“The product history data levels the playing field for retailers,” explains Dean. “Retailers want their customers to know when they have a sale on. This way, any retailer who offers a good price can let consumers know about it – not just the big names.

“And again, no-one else has this kind of data.”

As well as giving visibility over pricing changes and history, Pricesearcher provides several other useful functions for shoppers, including the ability to filter by whether a seller accepts PayPal, as well as delivery information and a returns link.

This is, of course, if retailers make this information available to be featured on Pricesearcher. The data from Pricesearcher’s initial 500 million products shed light on many areas where crucial information was missing from a product listing, which can negatively impact a retailer’s visibility on the search engine.

Like all search engines, Pricesearcher has ranking algorithms, and there are certain steps that retailers can take to optimize for Pricesearcher, and give themselves the best chance of a high ranking.

With that in mind, how does ‘Pricesearcher SEO’ work?

How to rank on Pricesearcher

At this stage in its development, Pricesearcher wants to remove the mystery around how retailers can rank well on its search engine. Pricesearcher’s Retail Webmaster and Head of Search, Paul Lovell, is currently focused on developing ranking factors for Pricesearcher, and conceptualizing an ideal product feed.

The team are also working with select SEO agencies to educate them on what a good product feed looks like, and educating retailers about how they can improve their product listings to aid their Pricesearcher ranking.

Retailers can choose to either go down the route of optimizing their product feed for Pricesearcher and submitting that, or optimizing their website for the crawler. In the latter case, only a website’s product pages are of interest to Pricesearcher, so optimizing for Pricesearcher translates into optimizing product pages to make sure all of the important information is present.

At the most basic level, retailers need to have the following fields in order to rank on Pricesearcher: A brand, a detailed product title, and a product description. Category-level information (e.g. garden furniture) also needs to be present – Pricesearcher’s data from its initial 500 million products found that category-level information was not provided in 7.9% of cases.

If retailers submit location data as well, Pricesearcher can list results that are local to the user. Additional fields that can help retailers rank are product quantity, delivery charges, and time to deliver – in short, the more data, the better.

A lot of ‘regular’ search engine optimization tactics also work for Pricesearcher – for example, implementing schema.org markup is very beneficial in communicating to the crawler which fields are relevant to it.

It’s not only retailers who can rank on Pricesearcher; retail-relevant webpages like reviews and buying guides are also featured on the search engine. Pricesearcher’s goal is to provide people with as much information as possible to make a purchase decision, but that decision doesn’t need to be made on Pricesearcher – ultimately, converting a customer is seen as the retailer’s job.

Given Pricesearcher’s role as a facilitator of online purchases, an affiliate model where the search engine earns a commission for every customer it refers who ends up converting seems like a natural way to make money. Smaller search engines like DuckDuckGo have similar models in place to drive revenue.

However, Dean is adamant that this would undermine the neutrality of Pricesearcher, as there would then be an incentive for the search engine to promote results from retailers who had an affiliate model in place.

Instead, Pricesearcher is working on building a PPC model for launch in 2019. The search engine is planning to offer intent-based PPC to retailers, which would allow them to opt in to find out about returning customers, and serve an offer to customers who return and show interest in a product.

Other than PPC, what else is on the Pricesearcher roadmap for the next few years? In a word: lots.

The future of search is vertical

The first phase of Pricesearcher’s journey was all about data acquisition – partnering with retailers, indexing product feeds, and crawling websites. Now, the team are shifting their focus to data science, applying AI and machine learning to Pricesearcher’s vast dataset.

Head of Search Paul Lovell is an analytics expert, and the team are recruiting additional data scientists to work on Pricesearcher, creating training data that will teach machine learning algorithms how to process the dataset.

“It’s easy to deploy AI too soon,” says Dean, “but you need to make sure you develop a strong baseline first, so that’s what we’re doing.”

Pricesearcher will be out of beta by December of this year, by which time the team intend to have all of the prices in the UK (yes, all of them!) listed in Pricesearcher’s index. After the search engine is fully launched, the team will be able to learn from user search volume and use that to refine the search engine.

The Pricesearcher rocket ship – founder Samuel Dean built this by hand to represent the Pricesearcher mission. It references a comment made by Eric Schmidt to Sheryl Sandberg when she interviewed at Google. When she told him that the role didn’t meet any of her criteria and asked why she should work there, he replied: “If you’re offered a seat on a rocket ship, don’t ask what seat. Just get on.”

At the moment, Pricesearcher is still a well-kept secret, although retailers are letting people know that they’re listed on Pricesearcher, and the search engine receives around 1 million organic searches on a monthly basis, with an average of 4.5 searches carried out per user.

Voice and visual search are both on the Pricesearcher roadmap; voice is likely to arrive first, as a lot of APIs for voice search are already in place that allow search engines to provide their data to the likes of Alexa, Siri and Cortana. However, Pricesearcher are also keen to hop on the visual search bandwagon as Google Lens and Pinterest Lens gain traction.

Going forward, Dean is extremely confident about the game-changing potential of Pricesearcher, and moreover, believes that the future of the industry lies in vertical search. He points out that in December 2016, Google’s parent company Alphabet specifically identified vertical search as one of the biggest threats to Google.

“We already carry out ‘specialist searches’ in our offline world, by talking to people who are experts in their particular field,” says Dean.

“We should live in a world of vertical search – and I think we’ll see many more specialist search engines in the future.”

Source:: searchenginewatch.com

Top Marketing News: Facebook Tests ‘Downvotes,’ Internet Rages at Google, Pandora Takes Aim

Six Essential Email Marketing Tips [Infographic]
Looking for email marketing success? These six tips can help you — and your emails — reach the right target. MarketingProfs

Google Brings the Popular Stories Format to AMP: Is It Worth Using?
Google announced a new story format for AMP (Accelerated Mobile Pages) this week. Google describes the new shiny thing as “a visual driven format for evolving news consumption on mobile.” Econsultancy

Internet Rages After Google Removes ‘View Image’ Button, Bowing to Getty
Google removed the “view image” button this week in response to a recent lawsuit from Getty. This move has enraged the internet, but was done with hopes of encouraging clicks through to the image’s hosting website. Ars Technica

45% Of Marketers Cite Content & Experience Management As Top Priority In 2018
Econsultancy’s Digital Trends 2018 report found that 45% of professionals surveyed cite content and experience management as their top priorities for this year, followed by the 32% that cited analytics. Econsultancy

Top Digital Advertising Trends
MediaPost compiled a research brief to show top digital advertising trends, including evidence that Google and Facebook owned 63% of the US digital market in 2017. Microsoft made strides, but remains a distant third place with just 4%. MediaPost

Google Announces Two Major Changes to Image Search
Google has announced two major changes to image search — including the previously reported removal of the “view image” button and the removal of the “search by image” button. Publishers are happy about these changes; Google search users aren’t so thrilled. Search Engine Journal

Pandora Takes Aim At Spotify And IHeartRadio With Programmatic Audio Ads
AdAge reports: “Pandora said Tuesday that it will now offer its audio inventory programmatically through popular demand-side platforms such as MediaMath, The Trade Desk and AdsWizz.” AdAge

Snapchat Gives Creators Access to Audience Analytics
Some select content creators on Snapchat are being given access to analytics and data about their audience, such as story views, engagement and demographics. This is only available to those who are part of Snapchat’s Official Stories program. Search Engine Journal

B2B Demand Generation: Marketers’ Favorite Tactics
Recent research from Demand Gen Report shows that email remains a top demand generation channel for both top and bottom funnel prospects. MarketingProfs

Facebook Is Testing A ‘Downvote’ Button
CNBC reports: “Facebook is testing a ‘downvote’ button that lets users flag and hide comments they deem inappropriate. The social network clarified that it is not a ‘dislike’ button and the test is running for a small set of people in the U.S. only.” CNBC

Google To Move More Sites To Mobile-First Index In Coming Weeks
Google plans on rolling more sites into the mobile-first index in the next several weeks. It’s time to make sure your site is optimized for mobile if you haven’t already — the time has finally come. Search Engine Land

On the Lighter Side
Red Stripe Says That, Whatever the Cost, It Will Buy a New Bobsled for Jamaica – AdWeek

TopRank Marketing (And Clients) In the News:

Steve Slater – Your M-Commerce Deep Dive: Data, Trends and What’s Next in the Mobile Retail Revenue World – Big Commerce
Lee Odden – Better Than Bonuses: 4 Motivators That Matter More Than Money – Workfront

We’ll be back next week with more top digital marketing news! If you need more in the meantime, follow @toprank on Twitter or subscribe to our YouTube channel.

The post Top Marketing News: Facebook Tests ‘Downvotes,’ Internet Rages at Google, Pandora Takes Aim appeared first on Online Marketing Blog – TopRank®.

Source:: toprankblog.com

The SEO’s essential guide to web technology

As an SEO professional, your role will invariably lead you to interactions with people in a wide variety of roles including business owners, marketing managers, content creators, link builders, PR agencies, and developers.

That last one – developers – is a catch-all term that can encompass software engineers, coders, programmers, front- and back-end developers, and IT professionals of various types. These are the folks who write the code and/or generally manage the underlying various web technologies that comprise and power websites.

In your role as an SEO, it may or may not be practicable for you to completely master programming languages such as C++ and Java, or scripting languages such as PHP and JavaScript, or markup languages such as HTML, XML, or the stylesheet language CSS.

And, there are many more programming, scripting, and markup languages out there – it would be a Herculean task to be a master of every kind of language, even if your role is full-time programmer and not SEO.

But, it is essential for you, as an SEO professional, to understand the various languages and technologies and technology stacks out there that comprise the web. When you’re making SEO recommendations, which developers will most likely be executing, you need to understand their mindset, their pain points, what their job is like – and you need to be able to speak their language.

You don’t have to know everything developers know, but you should have a good grasp of what developers do so that you can ask better questions and provide SEO recommendations in a way that resonates with them, and those recommendations are more likely to be executed as a result.

When you speak their language, and understand what their world is like, you’re contributing to a collaborative environment where everyone’s pulling on the same side of the rope for the same positive outcomes.

And of course, aside from building collaborative relationships, being a professional SEO involves a lot of technical detective work and problem detection and prevention, so understanding various aspects of web technology is not optional; it’s mandatory.

Web tech can be complex and intimidating, but hopefully this guide will help make things a little easier for you and fill in some blanks in your understanding.

Let’s jump right in!

The internet vs. the World Wide Web

Most people use these terms interchangeably, but technically the two terms do not mean the same thing, although they are related.

The Internet began as a decentralized network of independent interconnected computers.

The US Department of Defense was involved over time and awarded contracts, including for the development of the ARPANET (Advanced Research Projects Agency Network) project, which was an early packet switching network and first to use TCP/IP (Transmission Control Protocol and Internet Protocol).

The ARPANET project led to “internetworking” where various networks of computers could be joined into a larger “network of networks”.

The development of the World Wide Web is credited to British computer scientist Sir Tim Berners-Lee in the 1980s; he developed a system for linking hypertext documents, which resulted in an information-sharing model built “on top” of the Internet.

Documents (web pages) were specified to be formatted in a markup language called “HTML” (Hypertext Markup Language), and could be linked to each other using “hyperlinks” that users could click to navigate to other web pages.


Web hosting

Web hosting, or hosting for short, is a service that allows people and businesses to put a web page or a website on the internet. Hosting companies have banks of computers called “servers” that are not entirely dissimilar in nature to computers you’re already familiar with, but of course there are differences.

There are various types of web hosting companies that offer a range of services in addition to web hosting; such services may include domain name registration, website builders, email addresses, website security services, and more.

In short, a host is where websites are published.


Web servers

A web server is a computer that stores web documents and resources. Web servers receive requests from clients (browsers) for web pages, images, etc. When you visit a web page, your browser requests all the resources/files needed to render that web page in your browser. It goes something like this:

Client (browser) to server: “Hey, I want this web page, please provide all the text, images and other stuff you have for that page.”

Server to client: “Okay, here it is.”

Various factors impact how quickly the web page will display (render) including the speed of the server and the size(s) of the various files being requested.
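
To make this exchange concrete, here is a minimal Python sketch (standard library only) of a client requesting a page and inspecting the response; example.com is just a placeholder host. The Server response header it prints is also a quick way to see which of the server types listed below a site is running.

```python
import http.client

# Connect to a web server and request the homepage over HTTPS
# ("example.com" is a placeholder host used for illustration).
conn = http.client.HTTPSConnection("example.com")
conn.request("GET", "/")
response = conn.getresponse()

print(response.status, response.reason)        # e.g. "200 OK"
print(response.getheader("Server"))            # e.g. "Apache", "nginx" or "Microsoft-IIS/10.0"
print(len(response.read()), "bytes received")  # the HTML body the browser would render
conn.close()
```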

There are three server types you’ll most often encounter:

  1. Apache is open-source, free software compatible with many operating systems such as Linux. An often-used acronym is “LAMP stack” referring to a bundling of Linux, Apache, MySQL (relational database) and PHP (a server-side scripting language).
  2. IIS stands for “Internet Information Services” and is proprietary software made by Microsoft. An IIS server is often referred to as a “Windows Server” because it runs on Windows NT operating systems.
  3. NGINX – pronounced “Engine X”, is billed as a high-performance server able to also handle load balancing, used as a reverse proxy, and more. Their stated goals and reason for being include outperforming other types of servers.


Server log files

Often shortened to “log files”, these are records of server activity in response to requests made for web pages and associated resources such as images. Some servers may already be configured to record this activity; others will need to be configured to do so.

Log files are the “reality” of what’s happening with a website and will include information such as the page or file requested, date and time stamp of the request, the user agent making the request, the response type (found, error, redirected, etc.), the referrer, and a few other items such as bytes served and client IP address.

SEOs should get familiar with parsing log files. To go into this topic in more detail, read JafSoft’s explanation of a web server log file sample.
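
If you want to experiment with log parsing yourself, here is a minimal Python sketch for a single line in the common “combined” log format used by Apache and NGINX; the sample line is invented, and real log formats vary by server configuration.

```python
import re

# An invented sample line in Apache/NGINX "combined" log format.
line = ('66.249.66.1 - - [10/Feb/2018:06:25:14 +0000] '
        '"GET /products/widget HTTP/1.1" 200 5123 "-" '
        '"Googlebot/2.1 (+http://www.google.com/bot.html)"')

pattern = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

match = pattern.match(line)
if match:
    hit = match.groupdict()
    # Which URL was requested, what the server answered, and which bot/browser asked.
    print(hit["path"], hit["status"], hit["agent"])
```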

FTP

FTP stands for File Transfer Protocol, and it’s how you upload resource files such as webpages, images, XML Sitemaps, robots.txt files, and PDF files to your web hosting account to make these resource files available and viewable on the Web via browsers. There are free FTP software programs you can use for this purpose.

The interface is a familiar file-folder tree structure where you’ll see your local machine’s files on the left, and the remote server’s files on the right. You can drag and drop local files to the server to upload. Voila, you’ve put files onto the internet! For more detail, Wired has an excellent guide on FTP for beginners.
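
As a rough illustration of what FTP software does under the hood, here is a minimal Python sketch using the standard library’s ftplib; the host, credentials, directory and file names are all placeholders (and in practice you’d prefer FTPS/SFTP where your host supports it).

```python
from ftplib import FTP

# Placeholder host and credentials - substitute your own hosting account details.
with FTP("ftp.example.com") as ftp:
    ftp.login(user="username", passwd="password")
    ftp.cwd("/public_html")                             # change into the web root on the server
    with open("sitemap.xml", "rb") as local_file:
        ftp.storbinary("STOR sitemap.xml", local_file)  # upload the local file
    print(ftp.nlst())                                   # list remote files to confirm the upload
```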

Domain name

A domain name is a human-readable string of text used in a URL (Uniform Resource Locator). Keeping this simple, for the URL https://www.website.com, “website” is the domain name. For more detail, check out the Wikipedia article on domain names.

Root domain & subdomain

A root domain is what we commonly think of as a domain name, such as “website.com” in the URL https://www.website.com. A subdomain sits in front of the root domain – the “www.” in that URL is itself a subdomain. Other examples of subdomains would be news.website.com, products.website.com, support.website.com and so on.

For more information on the difference between a domain and a subdomain, check out this video from HowTech.

URL vs. URI

URL stands for “Uniform Resource Locator” and refers to a complete web address (such as https://www.website.com/this-is-a-page). URI stands for “Uniform Resource Identifier” and is the broader term – every URL is a URI – although in everyday SEO usage “URI” often refers loosely to just the path portion of a URL (such as /this-is-a-page.html). More info here.
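
A quick way to see the parts of a URL is to split one apart programmatically; here is a minimal Python sketch using the standard library, with a made-up URL.

```python
from urllib.parse import urlparse

parts = urlparse("https://www.website.com/this-is-a-page?color=red#reviews")
print(parts.scheme)    # 'https'
print(parts.netloc)    # 'www.website.com' (subdomain + root domain)
print(parts.path)      # '/this-is-a-page'
print(parts.query)     # 'color=red'
print(parts.fragment)  # 'reviews'
```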

HTML, CSS, and JavaScript

I’ve grouped together HTML, CSS, and JavaScript here not because each doesn’t deserve its own section, but because it’s good for SEOs to understand that those three languages comprise much of how modern web pages are coded (with many exceptions of course, and some of those will be noted elsewhere here).

HTML stands for “Hypertext Markup Language”, and it’s the original and foundational language of web pages on the World Wide Web.

CSS stands for “Cascading Style Sheets” and is a style sheet language used to style and position HTML elements on a web page, enabling separation of presentation and content.

JavaScript (not to be confused with the programming language “Java”) is a client-side scripting language to create interactive features on web pages.


AJAX & XML

AJAX stands for “Asynchronous JavaScript And XML”. Asynchronous means the client/browser and the server can work and communicate independently, allowing the user to continue interacting with the web page regardless of what’s happening on the server. JavaScript is used to make the asynchronous server requests, and when the server responds, JavaScript modifies the page content displayed to the user. Data sent asynchronously from the server to the client is packaged in an XML format so it can be easily processed by JavaScript. This reduces the traffic between the client and the server, which improves response times and speed.

XML stands for “Extensible Markup Language” and is similar to HTML in its use of tags, elements, and attributes, but it was designed to store and transport data, whereas HTML is used to display data. For the purposes of SEO, the most common usage of XML is in XML Sitemap files.
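
Since XML Sitemaps are the XML files SEOs touch most often, here is a minimal Python sketch that parses one with the standard library; the sitemap content is a small invented example.

```python
import xml.etree.ElementTree as ET

sitemap_xml = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.website.com/</loc><lastmod>2018-02-01</lastmod></url>
  <url><loc>https://www.website.com/this-is-a-page</loc><lastmod>2018-02-10</lastmod></url>
</urlset>"""

ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
root = ET.fromstring(sitemap_xml)

# Print every URL in the sitemap along with its last-modified date.
for url in root.findall("sm:url", ns):
    print(url.findtext("sm:loc", namespaces=ns),
          url.findtext("sm:lastmod", namespaces=ns))
```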

Structured data (AKA, Schema.org)

Structured data is markup you can add to the HTML of a page to help search engines better understand the content of the page, or at least certain elements of that page. By using the approved standard formats, you provide additional information that makes it easier for search engines to parse the pertinent data on the page.

Common uses of structured data are to markup certain aspects of recipes, literary works, products, places, events of various types, and much more.

Schema.org was launched on June 2, 2011, as a collaborative effort by Google, Bing and Yahoo (soon after joined by Yandex) to create a common, agreed-upon and standardized set of schemas for structured data markup on web pages. Since then, the term “Schema.org” has become synonymous with the term “structured data”, and Schema.org structured data types are continually evolving, with new types being added with relative frequency.

One of the main takeaways about structured data is that it helps disambiguate data for search engines so they can more easily understand information and data, and that certain marked-up elements may result in additional information being displayed in Search Engines Results Pages (SERPs), such as review stars, recipe cooking times, and so on. Note that adding structured data is not a guarantee of such SERP features.

There are a number of formats for adding structured data markup, but JSON-LD (JavaScript Object Notation for Linked Data) has emerged as Google’s preferred and recommended method of marking up Schema.org data; other formats such as microdata and RDFa are also supported.

JSON-LD is easier to add to pages, easier to maintain and change, and less prone to errors than microdata, which must be wrapped around existing HTML elements, whereas JSON-LD can be added as a single block in the HTML head section of a web page.
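
To illustrate what “a single block in the HTML head” looks like, here is a minimal Python sketch that builds a Schema.org Product object and prints it as a JSON-LD script tag; the product details are invented, and real markup should follow Google’s and Schema.org’s documentation for the type you’re using.

```python
import json

# An invented Product with a nested Offer, using standard Schema.org property names.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Widget",
    "description": "A sample product used to illustrate structured data.",
    "brand": {"@type": "Brand", "name": "ExampleBrand"},
    "offers": {
        "@type": "Offer",
        "price": "19.99",
        "priceCurrency": "GBP",
        "availability": "https://schema.org/InStock",
    },
}

# The block below would be pasted into the <head> of the product page.
print('<script type="application/ld+json">')
print(json.dumps(product, indent=2))
print('</script>')
```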

Here is the Schema.org FAQ page for further investigation – and to get started using microdata, RDFa and JSON-LD, check out our complete beginner’s guide to Schema.org markup.

Front-end vs. back-end, client-side vs. server-side

You may have talked to a developer who said, “I’m a front-end developer” and wondered what that meant. Or you may have heard someone say “oh, that’s a back-end functionality”. It can seem confusing what all this means, but it’s easily clarified.

“Front-end” and “client-side” both mean the same thing: it happens (executes) in the browser. For example, JavaScript was originally developed as something that executed on a web page in the browser, and that means without having to make a call to the server.

“Back-end” and “server-side” both mean the same thing: it happens (executes) on a server. For example, PHP is a server-side scripting language that executes on the server, not in the browser. Some Content Management Systems (CMS for short) like WordPress use PHP-based templates for web pages, and the content is called from the server to display in the browser.

Programming vs. scripting languages

Engineers and developers do have differing explanations and definitions of terms. Some will say there’s ultimately no difference, or that the lines are blurry, but the generally accepted distinction between a programming language (like C or Pascal) and a scripting language (like JavaScript or PHP) is that a programming language requires an explicit compiling step, in which human-created, human-readable code is turned into a specific set of machine-language instructions understandable by a computer, whereas a scripting language is interpreted at runtime without a separate compilation step.

Content Management System (CMS)

A CMS is a software application or a set of related programs used to create and manage websites (or, to use the fancy term, “digital content”). At its core, a CMS lets you create, edit, publish, and archive web pages, blog posts, and articles, and will typically have various built-in features.

Using a CMS to create a website means that there is no need to create any code from scratch, which is one of the main reasons CMSs have broad appeal.

Another common aspect of CMSs is plugins, which can be integrated with the core CMS to extend functionality beyond the core feature list.

Common CMSs include WordPress, Drupal, Joomla, ExpressionEngine, Magento, WooCommerce, Shopify, Squarespace, and there are many, many others.

You can read more here about Content Management Systems.

Content Delivery Network (CDN)

Sometimes called a “Content Distribution Network”, CDNs are large networks of servers which are geographically dispersed with the goal of serving web content from a server location closer to the client making the request in order to reduce latency (transfer delay).

CDNs cache copies of your web content across these servers, and then servers nearest to the website visitor serve the requested web content. CDNs are used to provide high availability along with high performance. More info here.

HTTPS, SSL, and TLS

Web data is passed between computers via data packets of code. Clients (web browsers) serve as the user interface when we request a web page from a server. HTTP (hypertext transfer protocol) is the communication method a browser uses to “talk to” a server and make requests. HTTPS is the secure version of this (hypertext transfer protocol secure).

Website owners can switch their website to HTTPS to make the connection with users more secure and less prone to “man in the middle attacks” where a third party intercepts or possibly alters the communication.

SSL refers to “Secure Sockets Layer” and is a standard security protocol used to establish encrypted communication between the server and the browser. TLS (Transport Layer Security) is the more recent version of SSL.
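
If you’re curious which protocol version a site actually negotiates, here is a minimal Python sketch using the standard library’s ssl module; example.com is a placeholder hostname.

```python
import socket
import ssl

hostname = "example.com"  # placeholder - use the site you want to check
context = ssl.create_default_context()

with socket.create_connection((hostname, 443)) as sock:
    with context.wrap_socket(sock, server_hostname=hostname) as tls:
        print(tls.version())             # e.g. 'TLSv1.2' or 'TLSv1.3'
        certificate = tls.getpeercert()
        print(certificate["notAfter"])   # when the site's certificate expires
```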

HTTP/1.1 & HTTP/2

When Tim Berners-Lee invented the HTTP protocol in 1989, the computer he used did not have the processing power and memory of today’s computers. A client (browser) connecting to a server using HTTP/1.1 receives information in a sequence of network request-response transactions, which are often referred to as “round trips” to the server, sometimes called “handshakes”.

Each round trip takes time, and HTTPS is an HTTP connection with SSL/TLS layered in, which requires yet another handshake with the server. All of this takes time, causing latency. What was fast enough then is not necessarily fast enough now.

HTTP/2 is the first new version of HTTP since 1.1. Simply put, HTTP/2 allows the server to deliver more resources to the client/browser faster than HTTP/1.1 by utilizing multiplexing, compression, request prioritization, and server push which allows the server to send resources to the client that have not yet been requested.
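
You can check which protocol version a server negotiates with a short script. The sketch below assumes the third-party httpx library installed with HTTP/2 support (pip install "httpx[http2]"); the URL is just an example of a site that commonly serves HTTP/2.

```python
import httpx  # third-party library; install with: pip install "httpx[http2]"

url = "https://www.google.com/"

# Force HTTP/1.1 only, then allow HTTP/2, and compare what the server negotiates.
with httpx.Client(http2=False) as client:
    print(client.get(url).http_version)  # 'HTTP/1.1'

with httpx.Client(http2=True) as client:
    print(client.get(url).http_version)  # 'HTTP/2' if the server supports it
```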


Application Programming Interface (API)

“Application” is a general term that, simply put, refers to a type of software that can perform specific tasks. Applications include desktop programs, web browsers, and databases.

An API is an interface with an application, typically a database. The API is like a messenger that takes requests, tells the system what you want, and returns the response back to you.

If you’re in a restaurant and want the kitchen to make you a certain dish, the waiter who takes your order is the messenger that communicates between you and the kitchen, which is analogous to using an API to request and retrieve information from a database. For more info, check out Wikipedia’s Application programming interface page.
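
In code, that “waiter” exchange usually looks like a simple HTTP request that returns structured data. Here is a minimal Python sketch, assuming the third-party requests library; the endpoint, parameters and API key are entirely hypothetical.

```python
import requests  # third-party library: pip install requests

# Hypothetical endpoint and credentials, purely for illustration.
response = requests.get(
    "https://api.example.com/v1/products",
    params={"query": "ipad", "country": "GB"},
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    timeout=10,
)
response.raise_for_status()   # raise an error on 4xx/5xx responses
data = response.json()        # the API's reply, decoded from JSON
print(data)
```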

AMP, PWA, and SPA

If you want to build a website today, you have many choices.

You can build it from scratch using HTML for content delivery along with CSS for look and feel and JavaScript for interactive elements.

Or you could use a CMS (content management system) like WordPress, Magento, or Drupal.

Or you could build it with AMP, PWA, or SPA.

AMP stands for Accelerated Mobile Pages and is an open source Google initiative consisting of a specified set of HTML tags and various functionality components, which are ever-evolving. The upside to AMP is lightning-fast loading web pages when coded according to AMP specifications; the downsides are that some desired features may not currently be supported, and there can be issues with proper analytics tracking.


PWA stands for Progressive Web App, and it blends the best of both worlds between traditional websites and mobile phone apps. PWAs deliver a native app-like experience to users, such as push notifications, the ability to work offline, and a start icon on your mobile phone’s home screen.

By using “service workers” to communicate between the client and server, PWAs combine fast-loading web pages with the ability to act like a native mobile phone app at the same time. However, because PWAs typically rely on JavaScript frameworks, you may encounter a number of technical challenges.


SPAs – Single Page Applications – are different from traditional web pages which load each page a user requests in a session via repeated communications with the server. SPAs, by contrast, run inside the browser and new pages viewed in a user session don’t require page reloading via server requests.

The primary advantages of SPAs include streamlined and simplified development, and a very fast user experience. The primary disadvantages include potential problems with SEO, due to search engines’ inconsistent ability to parse content served by JavaScript. Debugging issues can also be more difficult and take up more developer time.

It’s worth noting that future success of each of these web technologies ultimately depends on developer adoption.

Conclusion

Obviously, it would require a very long book to cover each and every bit of web technology, and in sufficient detail, but this guide should provide you, the professional SEO, with helpful info to fill in some of the blanks in your understanding of various key aspects of web technology.

I’ve provided many links in this article that serve as jumping off points for any topics you would like to explore further. There’s no doubt that there are many more topics SEOs need to be conversant with, such as robots.txt files, meta robots tags, rel canonical tags, XML Sitemaps, server response codes, and much more.

In closing, here’s a nice article on the Stanford website titled “How Does The Internet Work?” that you might find interesting reading; you can find that here.

Source:: searchenginewatch.com

Three SEO issues that your SEO report needs to include (but you probably overlook)

Good SEO reporting is tough. There’s so much conflicting and outdated advice in our industry that in many cases, SEOs tend to focus on buzz terms rather than good actionable advice.

I’ve seen hundreds of SEO reports throughout the years, and I often have a hard time walking away with a good plan of further action when it comes to making a website better optimized.

But that’s not the point of this article. What I’d like to start here is an open-ended discussion: which SEO issues do you include in your SEO audits that others don’t?

Inspired by some recent SEO audits, here are three important SEO issues that I notice are often overlooked:

1. Trailing slash / No trailing slash

One of the most under-estimated issues with structuring URLs is double-checking whether your URLs work with and without a trailing slash at the end.

Read my old article on the issue which, despite being published all the way back in 2009, still holds true.

To summarize:

  • The best practice is to have both versions work properly
  • Google’s official recommendation is 301-redirecting one version to the other, which is something your client wants to do even if both versions work
  • Apart from accidental broken links (e.g. some HTML editors add / at the end of the URL automatically), this could result in lower rankings and lost traffic/conversions (e.g. when other websites link to the broken version of the page)

To diagnose the issue, I tend to use the free website crawler from SEOchat. For some reason it catches these issues more often than other crawlers.

You can also simply run a couple of random URLs through a header checker to see that no link power is being leaked and no users are being lost.
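
If you’d rather script that header check, here is a minimal Python sketch assuming the third-party requests library; the URLs are placeholders for pages on the site you’re auditing.

```python
import requests  # third-party library: pip install requests

urls = [
    "https://www.website.com/this-is-a-page",   # without trailing slash
    "https://www.website.com/this-is-a-page/",  # with trailing slash
]

for url in urls:
    r = requests.head(url, allow_redirects=False, timeout=10)
    print(url, r.status_code, r.headers.get("Location", ""))
    # Ideally one version returns 200 and the other 301s to it;
    # a 404 on either version means inbound links to it leak traffic and link equity.
```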

Further reading: There’s another guide explaining problems and solutions when it comes to a trailing slash.

  • Implement 301 (permanent) redirects from one version to the other through your .htaccess file.
  • If you can’t use permanent redirects, use canonical elements instead – either will reduce the risk of duplicate content [Note: This solution is only valid when both versions actually work, so there will be no broken links]
  • Be consistent with your choice.

2. No H-subheads

A long time ago, we used to call those h2-h3 subheadings “semantic structure”, and we’d recommend using them to give keywords higher prominence.

H1 – H6 tags “briefly describe the topic of the section they introduce”. They can “be used by user agents, for example, to construct a table of contents for a document automatically”.

Other than that there was no obvious tangible benefit to using them.

These days, everything has changed because using H2 tags and including your keyword in them can get you featured!

Image taken from my Featured Snippets FAQ on Content Marketing Institute, where I explain how H2 tags can help you get a featured snippet

The featured snippet algorithm is changing daily, so it is going to get harder and harder to get featured. So far, though, these subheadings work like a charm, and thanks to that you have a very obvious reason to convince your clients to start using these tags: they can help you get featured!

Netpeak Spider is an excellent tool to diagnose H-structure of the whole website. It gives a detailed report containing content and number of H1-H6 tags, missing tags, and more:

Netpeak Spider
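
If you just want to spot-check a single page rather than crawl a whole site, here is a minimal Python sketch assuming the third-party requests and beautifulsoup4 libraries; the URL is a placeholder.

```python
import requests                 # pip install requests
from bs4 import BeautifulSoup   # pip install beautifulsoup4

html = requests.get("https://www.website.com/this-is-a-page", timeout=10).text
soup = BeautifulSoup(html, "html.parser")

# Print the page's heading outline, indented by heading level.
for tag in soup.find_all(["h1", "h2", "h3", "h4", "h5", "h6"]):
    indent = "  " * (int(tag.name[1]) - 1)
    print(f"{indent}{tag.name.upper()}: {tag.get_text(strip=True)}")
```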

3. Thin content

How do we define thin content?

  • Little original content on the page (usually, just a paragraph or two)
  • Lack of positive signals (links, clicks/traffic, mentions/shares. The latter is mostly an indicator of user engagement)

Everyone talks about thin content in our industry, but an alarming number of SEO reports fail to include it.

Why so? I see two reasons:

  • Thin content is hard to diagnose
  • Thin content is hard to explain (how to convince a client of a 100,000-page website to invest in editing existing content and consolidating lower-quality pages)

When it comes to diagnosing, I’d like to direct you to the awesome audit template from Annie Cushing. It does include thin content diagnostics and even explains how to find it.
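
As a starting point before reaching for a full audit template, here is a minimal Python sketch that flags pages with little visible text, assuming the third-party requests and beautifulsoup4 libraries; the URLs and the 200-word threshold are placeholders, since “thin” is ultimately a quality judgement, not just a word count.

```python
import requests                 # pip install requests
from bs4 import BeautifulSoup   # pip install beautifulsoup4

urls = [
    "https://www.website.com/category/widgets",
    "https://www.website.com/blog/a-short-post",
]

for url in urls:
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    for tag in soup(["script", "style", "nav", "header", "footer"]):
        tag.decompose()                       # drop non-content elements
    words = len(soup.get_text(separator=" ").split())
    print(url, words, "THIN?" if words < 200 else "ok")
```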

As to explaining the issue to clients, the problem with thin-content pages is that it can negatively affect the whole site. If a search crawler finds a high percentage of thin content on a website, it may decide that the whole site is not of much value either. That’s the essence of the Panda update (which is now part of Google’s algorithm).

For more context, check out this video by Jim Boykin:

Whenever your site is affected, almost always the answer is “You have too much thin content”

That being said, thin content may be a reason your client’s website is slowly but surely losing its rankings. It’s very tough to diagnose:

  1. There are no longer thin-content-related updates being announced.
  2. The loss of rankings is very gradual, making it hard to pinpoint when and why it started.
  3. Pages losing rankings may be of better quality than the rest of your website. Thin-content pages may not have ranked for ages; what’s new is that they may now start to negatively affect pages that do rank.

As with duplicate content, it’s not right to call this a penalty. Even though thin content may be dragging your site down, it’s not a penalty – it’s part of Google’s effort to keep its results high-quality.

What other SEO issues do you consider often overlooked in our industry? Share your insights!

Source:: searchenginewatch.com