Soumil Nitin Shah¶
Bachelor's in Electronic Engineering | Master's in Electrical Engineering | Master's in Computer Engineering¶
- Website : https://soumilshah.herokuapp.com
- Github: https://github.com/soumilshah1995
- Linkedin: https://www.linkedin.com/in/shah-soumil/
- Blog: https://soumilshah1995.blogspot.com/
- Youtube : https://www.youtube.com/channel/UC_eOodxvwS_H7x2uLQa-svw?view_as=subscriber
- Facebook Page : https://www.facebook.com/soumilshah1995/
- Email : shahsoumil519@gmail.com
In [10]:
df1.head(4)
Out[10]:
In [13]:
df2.head(4)
Out[13]:
Algorithms and Steps¶
- Convert the "Emails" column of each DataFrame into a Python list
In [35]:
l1 = df1["Emails"].to_list()
l2 = df2["Emails"].to_list()
In [22]:
len(l1)
Out[22]:
In [23]:
len(l2)
Out[23]:
Create a list of duplicate email addresses along with their indexes¶
In [24]:
import collections
Duplicate = collections.namedtuple("Emails", "index email")
tem = []
In [25]:
for i in range(0, len(l1)):
    for j in range(0, len(l2)):
        # Record the position in l1 of every email that also appears in l2
        if l1[i] == l2[j]:
            tem.append(Duplicate(i, l1[i]))
In [29]:
len(tem)
Out[29]:
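The nested loop above compares every pair of emails, so its cost grows with `len(l1) * len(l2)`. As a minimal sketch of an alternative, assuming exact string matches are what you want, you can build a set from the second list and scan the first list once. The sample lists and the helper name `find_duplicates` are made up for illustration and are not part of the original notebook. Note one difference: this version records each index of the first list at most once, even if the email occurs several times in the second list.

```python
import collections

Duplicate = collections.namedtuple("Emails", "index email")

def find_duplicates(first, second):
    """Return a Duplicate(index, email) for each entry of `first` that also occurs in `second`."""
    seen = set(second)  # set membership tests are O(1) on average
    return [Duplicate(i, email) for i, email in enumerate(first) if email in seen]

# Hypothetical sample data standing in for df1["Emails"] and df2["Emails"]
sample_l1 = ["a@x.com", "b@x.com", "c@x.com"]
sample_l2 = ["c@x.com", "a@x.com", "z@x.com"]

dups = find_duplicates(sample_l1, sample_l2)
print(dups)
```

With real data you would call `find_duplicates(l1, l2)` in place of the nested loop.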
Create a list with just the indexes where these duplicate values occur¶
In [33]:
Index_remove = []
for x in tem:
    Index_remove.append(x.index)
These are the indexes where the duplicate emails occur¶
In [44]:
Index_remove[0:12]
Out[44]:
Iterate over the emails¶
- Check whether each email's index exists in Index_remove; if it does, skip it, otherwise append the email to newList
- The appended values are your unique emails (those in l1 that do not appear in l2)
In [41]:
newList = []
for c, x1 in enumerate(l1):
    # Keep only the emails whose position was not flagged as a duplicate
    if c not in Index_remove:
        newList.append(x1)
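Checking `c in Index_remove` against a list rescans the list on every iteration. A small sketch of the same filtering step, assuming the indexes are plain integers: convert them to a set first, then keep the values whose position is not in it. The helper name `drop_indexes` and the sample values are hypothetical.

```python
def drop_indexes(values, indexes_to_remove):
    """Return `values` without the positions listed in `indexes_to_remove`, preserving order."""
    remove = set(indexes_to_remove)  # repeated indexes collapse into one entry
    return [v for i, v in enumerate(values) if i not in remove]

# Hypothetical sample data standing in for l1 and Index_remove
sample_emails = ["a@x.com", "b@x.com", "c@x.com", "d@x.com"]
sample_remove = [0, 2, 2]

kept = drop_indexes(sample_emails, sample_remove)
print(kept)
```

With the notebook's variables this would be `newList = drop_indexes(l1, Index_remove)`.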
- Create a pandas DataFrame from the unique emails
In [42]:
myuni = pd.DataFrame(data={
    "UniqueEmail": newList
})
In [43]:
myuni
Out[43]:
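The whole procedure can also be expressed without intermediate lists. The sketch below is one possible pandas-only version, not the notebook's method: it uses `Series.isin` to flag the rows of the first frame whose email appears in the second, then keeps the rest. The two small frames are made-up stand-ins for `df1` and `df2`, and it assumes both have an "Emails" column as in the cells above.

```python
import pandas as pd

# Hypothetical sample frames standing in for df1 and df2
sample_df1 = pd.DataFrame({"Emails": ["a@x.com", "b@x.com", "c@x.com"]})
sample_df2 = pd.DataFrame({"Emails": ["c@x.com", "a@x.com", "z@x.com"]})

# True for rows of sample_df1 whose email does NOT appear in sample_df2
mask = ~sample_df1["Emails"].isin(sample_df2["Emails"])

unique_df = (
    sample_df1.loc[mask, ["Emails"]]
    .rename(columns={"Emails": "UniqueEmail"})
    .reset_index(drop=True)
)
print(unique_df)
```

Like the loop-based cells, this compares strings exactly, so differences in case or surrounding whitespace are treated as different emails.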