Event
Seminar: Data Collection @ Christine Tsui International Fashion
Date: 03 March 2018
Seminar Host: Cliff-Madrid-Researcher Tsung (Madrid)
Cliff-Madrid-Researcher Tsung is an artificial intelligence scientist and data science consultant working in Madrid, Kingdom of Spain. He has worked on several national and European Union scientific projects with a primary focus on intelligent software systems. He currently operates a data science consultancy providing intelligent business solutions to SMEs in Madrid.
Outline
1. Sources of data
- Data as assets
- Internal vs. External Data
2. Tooling
- Investment in tooling
- Tools vs systems
3. New technologies
Part One: SOURCES OF DATA
Data as assets
Given the definition of “assets” in business:
An asset is a resource with economic value that an individual, corporation or country owns or controls with the expectation that it will provide future benefit.
Data can be considered an intangible asset of a business: they can be acquired, and they have the potential to generate benefits for the enterprise in the future. When we think of data management in business operations, we can, and we should, manage data the same way we manage other assets.
Business owners need to understand that collecting data, like acquiring any other asset, comes with a cost. It is always wise to weigh the potential benefits of the data against the costs of collecting them before taking any decisive action.
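The weighing of benefits against costs described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `DataSource` class, the `rank_sources` helper and all figures are hypothetical and were not presented in the seminar. Costs are meant to include both explicit costs (purchase price) and implicit ones (staff time, load on IT infrastructure).

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    """A candidate data source with rough yearly estimates (hypothetical figures)."""
    name: str
    collection_cost: float   # explicit + implicit cost of collecting the data
    expected_benefit: float  # estimated extra revenue or cost savings

    @property
    def net_benefit(self) -> float:
        return self.expected_benefit - self.collection_cost


def rank_sources(sources):
    """Keep only sources with a positive net benefit, best first."""
    worthwhile = [s for s in sources if s.net_benefit > 0]
    return sorted(worthwhile, key=lambda s: s.net_benefit, reverse=True)


candidates = [
    DataSource("in-store POS sales", collection_cost=2_000, expected_benefit=9_000),
    DataSource("purchased trend report", collection_cost=8_000, expected_benefit=6_000),
    DataSource("customer demographics survey", collection_cost=1_500, expected_benefit=4_000),
]

for source in rank_sources(candidates):
    print(f"{source.name}: net benefit {source.net_benefit:.0f}")
```

With these made-up numbers the purchased report drops out because it costs more than it is expected to return. In practice the benefit estimates are the hard part and depend on the individual business.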
– Cliff-Madrid-Researcher
From my personal experience, most enterprises from the non-tech sectors do not have an active data collection policy. Is your company actively collecting data from your business practice? Or is it a passive action? Especially those of you who work in the fashion business. By “active” I mean the existence of a clear regulation / policy that is enforced upon the employees.
– BettyWang-Helsinki-Student
Most of my previous client companies provide periodic competitor analysis. For data on products or fashion trends, they usually purchase data from third parties. And financial data is confidential, except what public companies disclose in their annual reports.
– Katherine SH Analyst
I would like to answer this one as I work in the fashion industry – it is collected since that’s all we’re doing in my dept., which can be regarded as ‘big data’. But other departments mostly rely on historical experience instead of what’s going on currently.
– Cliff-Madrid-Researcher
There are multiple kinds of data that are valuable to businesses; they come in various sizes, sources, formats etc. Data is a kind of intangible asset to enterprises, and like all assets, acquiring them carries some cost. It might be an explicit cost, as @Natalie~Xiamen~Analyst shared with us, or an implicit cost, such as time and pressure on IT infrastructure.
Internal vs. External Sources
Data sources can be categorized by various attributes. For new data science infrastructure projects, especially those developed for SMEs, one such attribute is “internal” vs. “external” data, i.e. the data generated and collected within one’s own business vs. the data collected from external entities.
We agreed upon the importance of internal data over external data for SMEs. Internal data are collected within the business’s range of control: i) they are usually cheaper to collect than external data, which business owners may have to purchase from a third party; ii) a business can usually act more quickly on decisions based on internal data than on those based on external data; iii) the quality and reliability of internal data are usually higher. Therefore, for new projects that need to show their value quickly, internal data carry greater value in the decision-making process, where data science truly shines.
Despite the above, one should not underestimate the value of external data. Some tasks, such as pricing policy and supply chain optimization, can only rely on external data.
– Katherine SH Analyst
We’ve tried third party as well but it doesn’t work out well, then managing level decided to get rid of it. It works out for financial or other aspects but just doesn’t work for fashion.
Natalie~Xiamen~Analyst
It might be an industry difference? We care a lot about competitors’ data, and it is really important for us in terms of settlement price etc. 3rd party info is not reliable, so we often choose at least 3 companies and compare.
Cliff-Madrid-Researcher
Ok, from my own experience, non-data-intensive industries (i.e. excluding finance, tech, healthcare etc.) would see quicker and better benefits from analyzing internal operational data.
Shanshan- Tokyo- investment
I’m not in the field but I think internal data should be more informative if the firms have the capacity to build a data team. 3rd party data is processed which might embed some bias caused by the data vendors
Alena~Shanghai~AGL
So in other words get as much as you can from internal first and continue with external if needed
Part Two: TOOLING
INVESTMENT IN TOOLING
In pursuit of higher effectiveness and efficiency, one might consider investing in cutting-edge hardware and software. A commonly overlooked reality is that the primary factor in the successful deployment of a data science project is the “people”. Investment in tooling should therefore be matched to the nature of the organization and the capabilities of its employees.
We agreed upon the benefits of investing in training programs that improve employees’ skills with their current tools, e.g. spreadsheets, SQL databases etc. Several examples of advanced add-ins for Microsoft Excel were presented.
Cliff-Madrid-Researcher
As we said, we want cheap (low collecting costs) and good (high future benefits) data. In order to achieve that, we need better tools and supporting company policies. I’d like to hear from your experience. What software do you use for collecting data?
Christine~Shanghai~Group founder
Like POS, ERP. Many software systems now: OMS, WMS. Many companies purchased expensive IT systems but only a very small part of the software is used.
Cliff-Madrid-Researcher
Many business owners put too much effort into purchasing state-of-the-art hardware and software, but overlook the basic cost: people and their time. The biggest reason we want to buy some fancy system with a three-capital-letter name is to lower HR cost. It might be sufficient to use Microsoft Excel (or other spreadsheet software like LibreOffice) if it fits your needs. Keep in mind that every action comes with a cost. Is it worth it? That highly depends on the individual business. Small family-based tailors are probably better off with paper and pen than with SQL databases.
TOOLS versus SYSTEMS
The requirements for ancillary technologies such as data science are vastly different in SMEs than in their bigger counterparts. While large enterprises rely on established and rigid protocols to ensure the consistency of their business operations, small companies opt for more agile approaches.
It is more practical for small businesses to use individual desktop or mobile software (apps) and maximize their utility. Replacing a currently functioning business protocol with an enterprise resource planning system carries a risk.
Cliff-Madrid-Researcher
For big enterprises with complicated business processes, “standardization” is really important to avoid chaos (to maintain consistency). So, software like ERP are essential for coordinating the business, clients, employees, materials etc.
Katherine SH Analyst
I remember back when I was interning in a fashion company, ERP was just something to make the boss happy to see, and most employees found it nothing but time wasting.
For big enterprises, changing systems could be a headache; as an employer, it does hurt at the beginning.
Cliff-Madrid-Researcher
@Heather-Wuhan-AGL standardization is practical in such enterprises. They need to play by the book. Stability above all, even if it comes with some side effects. But for small businesses, like most of my clients, we don’t “play by the book”; small business owners are more like “artists” than “engineers”. So when we think of tools, we need things on our belt that we can use anytime we want, and that we can put down or even throw away anytime we want. And this is useful not only for small businesses but also for big enterprises. If we consider the departments of a big enterprise as individual small businesses, then we can adopt such agile approaches. A practical suggestion is to encourage your employees to learn to make better use of the existing tools, such as spreadsheets (Excel for example: learn VBA, learn the simplex solver, learn socialbakers etc.). There are numerous interesting add-ins that you can use in Excel; for example, NodeXL is an Excel add-in that performs social network analysis.
Figure 1 NodeXL Interface
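To give a flavour of what social network analysis computes, here is a minimal Python sketch of degree centrality, one of the simplest measures for spotting well-connected accounts. It does not use NodeXL or reproduce its interface; the function name, the account names and the interaction list are all hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Share of other accounts each account is directly connected to.

    `edges` is an iterable of (account_a, account_b) pairs describing
    undirected interactions (mentions, replies, follows...).
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:  # ignore self-interactions
            continue
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


# Hypothetical interactions around a small fashion brand's account.
interactions = [
    ("brand", "anna"), ("brand", "bo"), ("anna", "bo"),
    ("anna", "chen"), ("anna", "dee"), ("dee", "eli"),
]

scores = degree_centrality(interactions)
top = max(scores, key=scores.get)
print(top, round(scores[top], 2))
```

In this toy network "anna" is connected to four of the five other accounts, so she has the highest score. Dedicated tools offer many more measures and visual layouts on top of this basic idea.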
Part Three: NEW TECHNOLOGIES
New technologies for data collection are booming. The rapid advancement of artificial intelligence brings new opportunities to obtain data that were once expensive or inaccessible. The list of technologies that the fashion industry can employ is forbiddingly long; for SMEs in this sector, computer vision and natural language processing are two interesting topics worth a glance.
Cliff-Madrid-Researcher
With the advancement of A.I., many more things can be done by the computer 24/7. For example, the picture I just sent is a demo from a face detection system, which can also estimate the gender and age of the person. By deploying such systems in the store, one can easily get a glance at the basic demographic info of the clients.
Figure 2 OpenCV Gender and Age Prediction
https://github.com/torch/torch.github.io/blob/master/blog/_posts/2016-06-01-deep-fun-with-opencv.md
Cliff-Madrid-Researcher
This is another example: an A.I. computer vision system that can isolate human figures and parse clothing parts from a street photo. We can then quickly analyze what colors, styles, pieces… people prefer. This is an example from 2012.
Figure 3 Clothing parsing with computer vision
Kota Yamaguchi, M. Hadi Kiapour, Luis E. Ortiz, Tamara L. Berg, “Parsing Clothing in Fashion Photographs”, CVPR 2012.
Conclusion
Several important aspects of data collection were discussed in this seminar. The fundamental proposal was to manage data as a type of intangible asset of the enterprise. Based on this proposal, a range of topics was introduced, including investment in data collection, the selection of tools and relevant new technologies. The seminar was divided into three parts. The first part covered how to evaluate sources of data; the participants agreed upon the nature of data as intangible assets and upon the value of internal vs. external data for small and medium enterprises. The second part investigated the costs and benefits of employing certain tools in data collection; the participants agreed upon the value of enhancing the utility of existing tools. The final part introduced several new technologies relevant to data collection.
麻花摊老板 13:16
First question
麻花摊老板 13:18
From my personal experience, most enterprises from non-tech sectors do not have an active data collection policy. Is your company actively collecting data from your business practice? Or is it a passive action?
麻花摊老板 13:18
Especially those of you who work in fashion business
Christine Tsui 13:20
U r right. Most dont
麻花摊老板 13:20
By “active” I mean the existence of a CLEAR regulation / policy that is enforced upon employees.
Betty 13:23
Most of my previous client companies provide periodic competitor analysis. For data of products or fashion trend, they usually purchase data from third party. And for financial data, it is confidential except those disclosed by public companies in their annual report.
Qian 13:23
Would like to answer this one as working in the fashion industry – it is collected since thats all we’re doing in my dept., which can be regarded as ‘big data’
Vincent 13:24
there are companies out there collecting data like StyleSage and EDITED
麻花摊老板 13:25
@Katherine SH Analyst How would you define “big data”?
Qian 13:25
But for other departments, mostly are relying on historical experience instead of what’s going on currently
Vincent 13:25
EDITED and StyleSage collect all online sales data
Qian 13:25
by tagging our customers 😉
Vincent 13:26
about the likes of ASOS, Nordstrom and so on
麻花摊老板 13:27
Big data has a specific definition in modern tech practice: it means a set of techniques for storing and processing data that cannot be processed by traditional means. It usually involves distributed computing (e.g. the MapReduce algorithm)
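For readers unfamiliar with the term, the MapReduce idea mentioned above can be sketched on a single machine as follows. This is a toy illustration, not a distributed system: the shard contents and function names are hypothetical, and in a real deployment each shard would sit on, and be processed by, a different machine.

```python
from collections import Counter
from functools import reduce


def map_shard(records):
    """Map step: total units sold per category within one shard of data."""
    counts = Counter()
    for category, quantity in records:
        counts[category] += quantity
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial results by summing per category."""
    return left + right


# Hypothetical sales records, split into shards as if stored on three machines.
shards = [
    [("dress", 3), ("coat", 1)],
    [("dress", 2), ("shirt", 5)],
    [("coat", 2)],
]

partials = map(map_shard, shards)                    # could run in parallel
totals = reduce(reduce_counts, partials, Counter())  # combine partial results
print(dict(totals))
```

The point of the split is that no single machine ever needs to hold all the records; each one only produces a small partial summary that is cheap to merge.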
麻花摊老板 13:28
great, that’s another observation. as @BettyWang-Helsinki-Student stated, competitor analysis is performed. I have seen many people are obsessed with “Competitor’s data”, it puzzled me actually.
麻花摊老板 13:29
Why would your clients have such interests in external data rather than internal business data?
Qian 13:30
We’ve tried third party as well but it doesn’t work out well
Qian 13:30
then managing level decided to get rid of it
麻花摊老板 13:31
@Katherine SH Analyst sorry, im a bit lost, what didnt work well?
Qian 13:31
3rd party data
Qian 13:33
it works out for financial or other aspects but just doesn’t work for fashion
Heather 13:33
@Cliff-Madrid-Researcher internal biz data are more valuable for a company right?
Natalie Lan 13:33
It might be industry difference? We care a lot for competitor’s data
Natalie Lan 13:34
And it is really important for us in terms of settlement price etc
麻花摊老板 13:35
I believe many of us who worked professionally on data related jobs would agree that secondary data (collected by 3rd party) are not reliable / practical in many scenarios
Yohanna 13:36
Just to say, as a student majoring in financial engineering, statistics is hard to learn; it needs a good understanding of mathematics
麻花摊老板 13:37
@Natalie~Xiamen~Analyst you are right, we will get to that later
麻花摊老板 13:37
As @Heather-Wuhan-AGL questioned, if “big data” is more valuable to enterprises
麻花摊老板 13:38
What do you think ?
Natalie Lan 13:38
3rd party info is not reliable so we often choose at least 3 companies and compare
Christine Tsui 13:39
@Cliff-Madrid-Researcher think of what? big data for enterprise?
麻花摊老板 13:39
@Christine~Shanghai~Group founder of the question made by Heather
Christine Tsui 13:39
i think most companies dont really know what is big data.
Christine Tsui 13:40
they have data. is data big data?
Yohanna 13:40
so big companies can cooperate with e-commerce companies to get big data
Christine Tsui 13:41
like Alibaba, Tencent? they r not just e-commerce
麻花摊老板 13:41
Let’s limit our discussion to fashion business
麻花摊老板 13:42
I have asked a bunch of questions, the point I was trying to make is:
Jeffrey Liu 13:44
Just the right time to jump in
麻花摊老板 13:44
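There are multiple kinds of data that are valuable to business, they come in various sizes, sources.
麻花摊老板 13:45
Data is a kind of intangible asset to enterprises, and like all assets, acquiring them will carry some costs.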
麻花摊老板 13:46
It might be an explicit cost, as @Natalie~Xiamen~Analyst shared with us
Natalie Lan 13:47
Exactly
Shanshan Wei.533 13:47
Agree with the intangible asset, great point
麻花摊老板 13:47
ó an implicito coste, such as time and pressure on IT infrastructure
麻花摊老板 13:47
or an implicit cost* …. sorry for the spanish
麻花摊老板 13:48
Ok, so, what is “asset”?
麻花摊老板 13:49
A rough definition is – “An asset is a resource with economic value that an individual, corporation or country owns or controls with the expectation that it will provide future benefit.”
麻花摊老板 13:50
One key point here is that such resources should have potentials of bringing benefits in the future
麻花摊老板 13:51
An intangible asset is an asset that is not physical in nature.
Christine Tsui 13:52
right. Data is intangible assets?
麻花摊老板 13:53
@Christine~Shanghai~Group founder if you are referring to the definition in accounting laws, I can’t guarantee.
麻花摊老板 13:54
such as “Corporate Reputation”, it is a valuable asset but not defined in the law
麻花摊老板 13:55
So
麻花摊老板 13:56
## Data are assets, and they should be treated that way.
## Collecting data has a cost; data should generate future benefit
麻花摊老板 13:58
Therefore, our goal is to maximize the utility of our data: lower the costs, elevate the benefits
Natalie Lan 13:59
one question: how do u judge that the data is “beneficial to future”?
麻花摊老板 13:59
yes, good question
麻花摊老板 13:59
@Natalie~Xiamen~Analyst what is the fundamental goal of business
麻花摊老板 13:59
?
Natalie Lan 13:59
People “thought” it is not valuable but it turns out to be crucial
Natalie Lan 14:00
Earn money?
Alena 14:00
I guess so
麻花摊老板 14:00
Yes, very good. or let’s put it in a fancier way “Profit Maximization”
Alena 14:00
That’s the main goal at the end of the day anyway
麻花摊老板 14:01
So how do you calculate profit?
Alena 14:02
Revenue minus cost
麻花摊老板 14:02
revenue – cost
麻花摊老板 14:03
yes
麻花摊老板 14:03
so
麻花摊老板 14:03
When we say “data is beneficial to future” we simply mean it helps us to lower the cost or increase revenue
Jeffrey Liu 14:04
But most of the companies are short-sighted
麻花摊老板 14:05
Keep this in mind, every single action you take in business is either to lower the cost or to increase revenue.
麻花摊老板 14:07
@Jeffrey-GZ-Private Label I always have faith in business owners, I think they are knowledgeable about their own businesses.
Alistair 14:07
It’s down to cash
Alistair 14:08
So i would say increase cash inflow, or decrease cash outflow
麻花摊老板 14:09
@Heather-Wuhan-AGL That is a bit off topic, try watching the lecture I gave last week
麻花摊老板 14:10
Ok, from my own experience, non-data-intensive industries (excluding finance, tech, healthcare sector etc) would see quicker and better benefits from analyzing internal operational data
Qian 14:11
so true
麻花摊老板 14:12
So what do you think, from your own experience? Is it better to emphasize the internal or the external data first?
Shanshan Wei.533 14:12
Lack of skilled staff is a bottleneck? – at least it’s what I observed in Japan
Qian 14:12
by mentioning internal operating data – is it incl display way as selling report or focusing on new ways?
??? 14:14
I work in full dress, it is a small market. Only one company I know collects “best sell”, “worst sell” and years of possible sell data, an HK company I worked for before
Alena 14:14
I would say internal cause external require additional cost that would lower the final profit
麻花摊老板 14:15
@Katherine SH Analyst That is processing the data, let’s focus on collecting them. For example, the demographic attributes of your clients
Alena 14:16
So in other words get as much as you can from internal first and continue with external if needed
Shanshan Wei.533 14:16
I’m not in the field but I think internal data should be more informative if the firms have the capacity to build a data team. 3rd party data is processed which might embed some bias caused by the data vendors
麻花摊老板 14:17
Personally I support such strategy as mentioned by Alena.
麻花摊老板 14:18
Primary internal data (the data you collect from your business) comes first.
麻花摊老板 14:18
But why?
??? 14:18
internal maybe , deal with own matter first
麻花摊老板 14:19
Why internal data comes first?
Christine Tsui 14:21
direct related w ur biz
麻花摊老板 14:22
When we think of a question, it is wise to deduce to the most fundamental and then make sense.
Christine Tsui 14:22
direct representation of ur biz
麻花摊老板 14:22
Yes, Dr. Tsui.
Christine Tsui 14:22
agreed
麻花摊老板 14:23
Data analysis helps in decision making process
麻花摊老板 14:23
Once we have made a decision, we definitely want to implement it later.
Christine Tsui 14:23
precisely
麻花摊老板 14:24
To successfully implement a decision, one needs to have some degree of control of the environment
麻花摊老板 14:24
Internal data comes from your “range of control”
Christine Tsui 14:25
yes. Though most don’t have such control at all
Alena 14:25
Aaah, I see
麻花摊老板 14:26
External data are, too, very valuable.
麻花摊老板 14:27
But the capability of decision making might not match the data
麻花摊老板 14:30
To conclude the first part of this discussion: 1. Data are assets, and we should manage them as assets.
麻花摊老板 14:30
2. Evaluate your decision making process, balance the costs and benefits of data collection
麻花摊老板 14:32
ok, it’s been 1hr and half, we can talk about data collection and storage from technical perspective as scheduled, or we can do some casual Q&A, what you think?
麻花摊老板 14:36
As we said, we want cheap (low collecting costs) and good (high future benefit) data.
麻花摊老板 14:38
In order to achieve that, we need better tools and supporting company policies
麻花摊老板 14:39
I’d like to hear from your experience. What do you use to collect data?
麻花摊老板 14:39
Software, hardware
Christine Tsui 14:39
like POS
Christine Tsui 14:40
in retailing. most still use POS
麻花摊老板 14:40
yes, pos for sales.
麻花摊老板 14:40
HR?
麻花摊老板 14:40
supply chain?
Christine Tsui 14:41
ERP?
Christine Tsui 14:42
many software now. OMS. WMS.
Alena 14:42
ERP?
麻花摊老板 14:42
Human
麻花摊老板 14:42
People
麻花摊老板 14:42
your employees
麻花摊老板 14:44
I’m not sure about China, but human resource is a big burden in many places
Christine Tsui 14:44
yes
Christine Tsui 14:44
same in China
Christine Tsui 14:45
when u say big burden. means in terms of what? cost? or efficiency?
Alena 14:45
In terms of supporting company’s policy?
麻花摊老板 14:45
And many business owners put too much effort into purchasing state-of-the-art hardware and software, but overlook the basic cost: people and their time
麻花摊老板 14:46
@Alena~Shanghai~AGL come to that later
Alena 14:46
Ok
麻花摊老板 14:47
The biggest reason we want to buy some fancy systems with a name that contains 3 capital letters is to lower HR cost
麻花摊老板 14:47
We should keep that in mind, if your HR cost is low, Excel might be sufficient.
Christine Tsui 14:47
@Cliff-Madrid-Researcher exactly. it is same in China
Christine Tsui 14:48
agreed.
Christine Tsui 14:48
many companies purchased expensive IT systems but only a very small part of the software is used.
麻花摊老板 14:49
And this is related to the question made by @Shanshan- Tokyo- investment
麻花摊老板 14:50
(She asked if “incapable employees” is a bottleneck for company’s adoption to new technologies)
麻花摊老板 14:51
You need tools and policies that is adopt to your own business’ capabilities. Don’t deploy fancy software if no one knows how to use them and you have no plan to hire some one new
麻花摊老板 14:51
are adopted*
Alena 14:51
Well, makes sense
Alena 14:52
Yep, but what if there is a constant failing human factor? Should you rather invest in better HR practice in that case?
麻花摊老板 14:53
From a survey conducted in China, less than 30% of the enterprises interviewed state the ERP systems that are currently deployed as “useful”
Qian 14:53
haha fancy system with 3 capital letter names
麻花摊老板 14:54
@Alena~Shanghai~AGL Good question. And very hard to answer. Keep in mind that every action comes with a cost, is it worth it? That highly depends on the individual business.
麻花摊老板 14:55
Small family-based tailors are probably better with paper and pen rather than SQL databases
Qian 14:56
remember back when I was doing intern in a fashion company ERP is just sth to make the boss happy to see and most employees found it nothing but time wasting
Alena 14:56
Get your point
麻花摊老板 14:57
@Katherine SH Analyst Yes. My family owns some factories in China, I was personally involved in the development and deployment of the ERP and SPC (statistical process control) systems
麻花摊老板 14:57
The only reason such systems were deployed was to pass the ISO qualification inspection
Alena 14:57
Wow
麻花摊老板 14:58
But there is a catch
Alena 14:58
Personally, I had no idea it might be so inefficient or not useful even
麻花摊老板 15:00
For big enterprises with complicated business processes, “standardization” is really important to avoid chaos
???Natalie Lan 15:00
And it is expensive @Alena~Shanghai~AGL
麻花摊老板 15:00
So, softwares like ERP are essential for coordinate the business, clients, employees, materials etc etc
麻花摊老板 15:01
coordinating*
Heather 15:02
and those tools benefit the “standardization”? even if they are not that practical….
Qian 15:03
for big enterprises changing system could be a headache
Qian 15:04
as employer it does hurt at the beginning
麻花摊老板 15:04
@Heather-Wuhan-AGL standardization is practical in such enterprises. They need to play by the book. Stability above all.
麻花摊老板 15:04
Even if it comes with some side effects
麻花摊老板 15:05
But for small businesses, like most of my clients, we don’t “play by the book”,
Christine Tsui 15:05
@Cliff-Madrid-Researcher that is true. hahaha
麻花摊老板 15:06
Small business owners are more like “artists” than “engineers”
麻花摊老板 15:08
@Heather-Wuhan-AGL Think of KFC, it might not be the best fried chicken in the world but standardization made it fast and consistent. The side effect is KFC cant satisfy your demand of putting carrot in the burger.
Christine Tsui 15:08
@Cliff-Madrid-Researcher exactly.
麻花摊老板 15:10
So when we think of tools, we need things on your belt that you can use it anytime you want and you can put it down or even throw it away anytime you want.
麻花摊老板 15:12
And this is useful not only for small businesses but also big enterprises
麻花摊老板 15:13
If we consider departments of a big enterprise as individual small businesses, then we can adopt such “agile” approaches
麻花摊老板 15:14
Therefore
麻花摊老板 15:15
A practical suggestion is to encourage your employees to learn better use of the existing tools, such as spreadsheets (Excel for example, learn VBA, learn simplex solver, learn socialbakers etc).
麻花摊老板 15:16
[Image]
麻花摊老板 15:17
for example, with NodeXL, you can do social media analysis (to find your KOI) within Excel
麻花摊老板 15:18
This amazing plugin is free and open source (means you can even change the code)
麻花摊老板 15:18
there is a wide range of such free plugins, some might be very beneficial to your business with little costs
麻花摊老板 15:19
@Heather-Wuhan-AGL Supporting policies 🙂
麻花摊老板 15:21
Supporting policies shall be made once a company finds data science might be beneficial. This usually includes training, extra bonus etc. I won’t go too far on this since it highly depends.
麻花摊老板 15:25
To conclude the second part:
- In addition to big and rigid IT infrastructure, one may consider simpler and agile tools for data collection and analysis jobs.
- Such tools shall adapt to the employees’ capabilities
- Tools are not mutually exclusive, one may use different tools at different levels (personal – team – department – enterprise)